This text is an introduction to an operational outlook on Bell inequalities, which
has been very fruitful in the past few years. It has led to the recognition that Bell
tests have their own place in applied quantum technologies, because they quantify
non-classicality in a device-independent way, that is, without any need to describe
the degrees of freedom under study and the measurements that are performed.
At the more fundamental level, the same device-independent outlook has allowed
the falsification of several other alternative models that could hope to reproduce
the observed statistics while keeping some classical features that quantum theory
denies; and it has shed new light on the long-standing quest for deriving quantum
theory from physical principles.

1.1 A parable...

A physicist turned bureaucrat was sent to audit the quantum laboratories of a university.
He entered the first lab and found a student intent on his experiment.

- "Good morning. I was told that you do exciting physics here. What are you working 
on?" 

- "This is a Stern-Gerlach experiment. I send a beam of atoms through this inhomoge- 
neous magnetic field and measure the spin, a purely quantum effect." 

- "Nice! This was advanced material in my undergraduate days. You needed that cum- 
bersome Dirac notation. Can you remind me what one looks for in such an experiment?" 

- "The theory says that, if the spin points in direction $\hat n$ and you orient the magnet in a
direction $\hat a$, the probability of finding the atom in the upper beam is
$P(+\hat a|\hat n) = \frac{1}{2}(1 + \hat a\cdot\hat n)$. I am checking these predictions and it works really very well."

The student showed his skills and those of his machine by running the experiment. The 
match with the prediction was indeed remarkable. The bureaucrat was impressed and 
asked: 

- "Can you remind me how we conclude that this is a quantum effect?" 

- "Well... you see, the result of the measurement is discrete..." 

- "But so it would be if I tossed a coin and observed heads or tails."

- "Right, but the spin measurement is random. Take your example: the coin looks ran- 
dom because we don't control all the parameters; but here, the outcome is really random. 
We can't predict individual events." 

The bureaucrat paused to think, and after a while said: 

- "I am not convinced. May I try something? — No, I won't touch your delicate ap- 
paratus. I plan even to go out of this room. I shall simulate a source of spins pointing 
in the z direction. You stay inside, write the measurement direction on a piece of paper 
and pass it to me below the door. I'll try to reply as the spins would." 

The student watched the bureaucrat close the door. Then he took a piece of paper, wrote
$\hat a = \frac{1}{2}\hat z + \frac{\sqrt{3}}{2}\hat x$ on it and slipped it under the door. Soon afterwards, a small paper
square flew in: on it was written up; then another, and another, and another... up,
down, down, up, down, up, up, ... building up the expected statistics P(up) = 75% and
P(down) = 25%. After a while, the bureaucrat came back in.
- "So, am I a quantum source? Really random?" he said with a wry smile.

The student was flummoxed. He had been told that the Stern-Gerlach magnet produced 
real randomness. It came back to his mind how Feynman and Schwinger, in their text- 
books, use precisely that experiment to build up quantum theory itself. Seeing his idols 
in danger of being brought down, he became mildly aggressive: 

- "Your tricky game is unfair. It just proves that quantum theory is predictive, that we 
know what to expect..." 

The bureaucrat did not wait for the end and went away, stunned by his smartness. The 
student was left to ponder the end of his own sentence. Indeed: we know what to expect! 
If this information exists (or can be easily computed by a bureaucrat), why have we all been
told to believe that the source of atoms does not possess that information in advance?

After a well deserved lunch, the bureaucrat went back to his task and knocked at the 
door of the second lab. A girl emerged from obscurity and politely asked what was the 
matter. The bureaucrat introduced his auditing role and went straight to the point: 

- "Are you also doing some fundamental experiment here?" 

- "Yes, sir. I am observing the violation of Bell inequalities by two entangled photons." 

- "And you are surely convinced that this is really quantum, are you not? Show me the
statistics you are getting and I shall show you something funny."

- "Ah, sir, you are certainly the one who gave trouble to my friend this morning. He 
told me what you did, it was intriguing indeed. But you can't do the same here, it would 
be unfair." 

- "Unfair? Why? Because you want to keep your blind faith in what you were taught? 
This is science, not -" 

- "That is not the reason, sir. You see, here we have two photons, and each is measured 
independently at a different location. Each photon cannot possibly know which mea- 
surement we perform on its twin, because we choose them at the last moment and the 
information does not have time to reach the other location." 

- "And so..." 

- "And so, if your simulation has to be fair, I can't give the information about both 
measurements to you alone. You have to come with a friend. Each of you will have to go 
to a different room, and without your mobile telephones. Then, I give one measurement 
to you and another to your friend." 

- "This way, I simulate one photon and my friend the other, right? Fair enough. Now,
who -"

A voice came from the corridor: 

- "Hey, what a surprise? What are you doing here?" 

It was a former classmate of the bureaucrat, who had persevered in the academic world 
and had specialized in mathematical physics. He immediately volunteered to play the 
friend in the simulation. They sat down together, came up with a simple strategy and 
underwent the test — but, when they checked their simulation with the student, they 
found they had failed to reproduce the expected statistics for some of the measurements. 
Intrigued by the challenge, they went to the cafeteria and prepared a more elaborate
common strategy while sipping a good coffee. The bureaucrat asked:

- "By the way, what is this violation of Bell inequalities the girl was speaking about?
There is a great fuss about it in some blogs that I follow."

- "Oh, it's just an obvious consequence of describing two opposite spins $\frac{1}{2}$ in a C*-algebra.
Finite-dimensional, non-relativistic quantum mechanics - as boring as it gets."

They went back to the lab and tried the simulation again. It went pretty well, most
of the statistics were indeed correct; but not all, and the simulation of some pairs of
measurements was wide of the mark. However, the time had come at which bureaucrats
are asked to stop working. He took leave from the student and from his friend and 
walked out in the fresh evening. He had had a very pleasant day out of office: he would 
recommend the funding for both projects to be renewed. 

1.2 ... and its meaning 

The bureaucrat and his friend could have tried much longer: they would not have suc- 
ceeded in simulating the statistics observed in the experiment with entangled particles. 
This statement is the operational meaning of the violation of Bell inequalities. This text 
will illustrate how exciting physics arises from this clean operational approach to Bell 
inequalities. 

It may be superfluous to pass explicit judgment on the characters of the story, insofar 
as they were playing each their natural role. Let me do it anyway, for the sake of those 
younger readers who may not be acquainted with the academic world: 

• The mathematical physicist got it wrong. The violation of Bell inequalities is not a 
straightforward exercise in finite-dimensional quantum mechanics. It is a criterion 
independent of quantum physics. Finite-dimensional quantum mechanics provides 
excellent agreement with the observation, which should lead to appreciation rather 
than dismissal. 

• The bloggers won't be able to separate light and darkness from the primordial 
chaos in which they are immersed. To be fair, there may be something exciting at 
the philosophical level about the violation of Bell inequalities. However, we won't 
need to look outside physics to appreciate the power of Bell. 

• The students are in the process of getting it right: they just have to undergo the 
critical phase transition when they stop believing blindly in their supervisors and 
switch their own brains on (not to antagonise the supervisor, but to appreciate the 
supervisor's wisdom while building up an independent wisdom of their own). 

• The bureaucrat got it absolutely right in deciding to continue funding research. 

1.3 A user's guide to this text 

This text is an edited version of the lecture notes of a module I am teaching within CQT's
graduate programme. It is not a review paper: I recently co-authored one such paper
[Brunner et al. 2013] and the reader is encouraged to refer to it for a comprehensive
overview of the research in this field. Here, I have made a clear selection of material,
focused on simple examples and kept the references to a minimum.

The readers are supposed to be familiar with the quantum formalism. If this is not the
case, they can refer to any of the excellent books and courses which have gone to great
didactical lengths to introduce those notions carefully. For instance, [Preskill notes] are
in open access.

2 Bell inequalities as an operational notion 

2.1 Introductory matters 

2.1.1 Bell experiments 

This text deals with the description of a very specific class of experiments sketched in
Fig. 1. Two parties Alice and Bob are at distinct locations. Each has a measurement
device, which shall be treated as a black box with an input (say, a knob, to choose
the measurement setting) and an output (to record the result). In each run of the
experiment, each party sets the knob at a randomly chosen position and receives an
outcome. After repeating the procedure several times, Alice and Bob come together
(or exchange information via communication) and compute the joint statistics of their
observations^1.





Fig. 1. Bipartite Bell experiment. Notice that we do not need to specify a "channel" between the 
locations: the boxes may have been pre-loaded with shared information (classical or quantum), 
but in the Bell test Alice and Bob act in a completely independent way. 



Such experiments have been repeatedly realized in the last few decades but were
first proposed as Gedankenexperimente. The famous 1935 paper by Einstein, Podolsky
and Rosen [Einstein, Podolski and Rosen 1935] is obviously a precursor, but the setup
they presented would not allow for two spatially separated measurement stations
(see Appendix), which, as we shall see, is crucial in our present understanding. This
became possible only when David Bohm in 1951 rephrased the EPR argument using
entanglement of internal degrees of freedom. It is essentially on Bohm's description
that most of the subsequent work relies, including certainly the work of John Bell that
eventually put the discussion on the right grounds [Bell 1964]. Nowadays, this class of
experiments is referred to as EPR (or EPR-like) experiments, Bell-Bohm experiments, or
simply Bell experiments; I shall use the latter.

^1 In this text, Alice and Bob are always the verifiers that operate the black boxes: their role is the
one described here, namely to choose measurement settings, record outcomes and compute statistics. In
some papers, Alice and Bob are rather those who receive the settings from a referee and are supposed
to produce outcomes according to the desired statistics (think of an experimentalist in each box, who
has control over the internal mechanisms). In this paper, only in paragraph 5.2.1 will it be convenient
to give names to the simulators, and I shall use Anthony and Beatrix for that.

Let us fix the notation: the possible measurement settings for Alice will be denoted
$x \in \mathcal{X} = \{1, \dots, M_A\}$, her outcome $a \in \mathcal{A} = \{1, \dots, m_A\}$. Similarly, Bob's setting is
$y \in \mathcal{Y} = \{1, \dots, M_B\}$ and his outcome is $b \in \mathcal{B} = \{1, \dots, m_B\}$. Two points are worth
stressing: 

• Alice and Bob are not requested to know what each input (each position of the 
knob) corresponds to inside the device. For all they know a priori, it is even possible 
that all the positions of the knob correspond to the same physical operation inside 
the box; of course, if this is the case, a posteriori they will observe that their joint 
statistics do not change with the input. 

• The fact that the outcomes are discrete does not imply that quantum degrees of 
freedom, let alone finite-dimensional ones, are being measured: the boxes may 
be measuring the frequency of light beams with devices that group the results 
according to the traditional seven colors of the rainbow. 

For simplicity of the presentation, let us temporarily assume that the behavior of the black
boxes is the same in each run (I'll show how to remove this assumption in a later paragraph).
The experiment is then described by the family of probability distributions^2

$$\mathcal{P}_{\mathcal{X},\mathcal{Y}} = \{P(a,b|x,y) : x \in \mathcal{X},\ y \in \mathcal{Y}\}. \qquad (1)$$

The statistics $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ will be called the observed statistics: strictly speaking, any
experiment involves a finite number of samples and what is really observed is an estimate,
whose accuracy can be quantified with the usual statistical techniques.

^2 All along this text, I shall use the notation of conditional probabilities $P(a,b|x,y)$, to be read "the
statistics of the outcomes, given the settings". The distribution of settings x, y will never be used: for
the purposes of this text, one could just as well treat the settings as parameters labeling probability
distributions and write $P_{x,y}(a,b)$ instead.

2.1.2 Describing the observed statistics

Without loss of generality, we can write

$$P(a,b|x,y) = \int d\lambda\, \rho(\lambda|x,y)\, P(a,b|x,y,\lambda) \qquad (2)$$

where $\rho(\lambda|x,y) \geq 0$, $\int d\lambda\, \rho(\lambda|x,y) = 1$ and where all the $P(a,b|x,y,\lambda)$ are valid
probability distributions. Here, $\lambda$ can be called "one's favorite explanation": the mathematical
description of the process one wants to invoke to explain the observed statistics.

For instance, if quantum theory is one's favorite explanation, one will look for a state
$\rho$, for $M_A$ POVMs $\mathcal{M}^x = \{E_a^x | a \in \mathcal{A}\}$ and for $M_B$ POVMs $\mathcal{M}^y = \{E_b^y | b \in \mathcal{B}\}$, such that
$\rho(\lambda|x,y) = \delta(\lambda - \rho)$ and $P(a,b|x,y,\lambda) = \mathrm{Tr}(\lambda\, E_a^x \otimes E_b^y)$. Thus, one obtains the familiar
expression from quantum theory

$$P_Q(a,b|x,y) = \mathrm{Tr}(\rho\, E_a^x \otimes E_b^y). \qquad (3)$$

Another important class of explanations are deterministic explanations, in which the
outcomes are uniquely determined by the inputs:

$$P(a,b|x,y,\lambda) = \delta_{(a,b)=F(x,y,\lambda)} = \delta_{a=f(x,y,\lambda)}\,\delta_{b=g(x,y,\lambda)}. \qquad (4)$$

The second expression says that, if the pair (a, b) is uniquely determined from the input, 
then a is uniquely determined and b is uniquely determined. 

2.1.3 Last synopsis before the real start 

In the remainder of this text, unless stated otherwise, I shall assume that quantum
theory gives accurate predictions or, in other words, that all the observed statistics can
be obtained with quantum theory and written in the form (3). The goal is to put quantum
theory to the test by not taking it as our favorite explanation and trying alternative ones.
As we are going to see, all the alternatives that have been tried fail at some point. This
brings two scientific benefits: a falsification of the apparently reasonable ideas that led
to formulating the alternative explanations; and a strengthening (if not a confirmation) of
quantum theory itself^3.

Concretely, there is general agreement that the explanation of Bell experiments fully 
compatible with classical prejudices is given by the so-called local-variable (LV) models. 
Most of this text is devoted to presenting this classical explanation, its failure to repro- 
duce observed statistics (captured by the observation of the violation of Bell inequalities) 
and the consequences that this fact entails. Once a fully classical explanation is ruled 
out, one can try to save at least some features of our classical intuition. Remarkably,
even the models of this kind that have been proposed fail to reproduce all the predictions of
quantum theory (more in Section 5).

2.2 Pre-established agreement a.k.a. local variables (LV) a.k.a. shared randomness

Correlations between distant parties are commonplace in our classical world. For instance,
all the agents of a bank delegated to various stock markets start simultaneously
selling or buying some bonds, depending on an input which could be the result of a political
election. There is no miracle in this simultaneity: either they all received a message
from the central office or the senior agent; or, even more probably, they had agreed in
advance on how to behave. This example illustrates the only two classical mechanisms
that explain correlations between distant parties: communication (a.k.a. signaling) and
pre-established agreement. In fact, both mechanisms can be grouped under the single
header of "common cause in the past"; but for the purpose of this discussion, it is useful
to treat the two mechanisms separately.

^3 While the first benefit is undeniable, the second is debatable: in absolute terms, quantum physics is
first and foremost strengthened by the unparalleled scope of its successful predictions. However, there
is a benefit in addressing the core tenets of quantum theory in a simple and direct way, instead of just
being crushed under the sheer mass of its achievements, and, in my experience, this is certainly the
way to follow in outreach beyond the physics community.

2.2.1 Formal characterization of local variables 

Let us leave signaling aside for the moment and let us make pre-established agreement our
tentative favorite explanation. This can be translated into three independent constraints^4
on the expressions that appear in (2):

1. Outcome independence: once $\lambda$ is given, each agent can act alone upon receiving
the inputs. Mathematically it reads

$$P(a,b|x,y,\lambda) = P(a|x,y,\lambda)\,P(b|x,y,\lambda). \qquad (5)$$

In other words, any observed correlation should be explained by the statistical
distribution^5 of $\lambda$'s. As we have seen in (4), deterministic explanations automatically
fulfill outcome independence; conversely, an explanation that does not fulfill
outcome independence must contain some intrinsic randomness.

2. No-signaling: as we said, we want to keep the two classical explanations separate.
Therefore we ask that Alice does not learn Bob's input, nor Bob Alice's:

$$P(a|x,y,\lambda) = P(a|x,\lambda), \quad P(b|x,y,\lambda) = P(b|y,\lambda). \qquad (6)$$

One may think that this condition fails in the example with the bank agents, insofar
as they are supposed to receive the same input (the result of the elections). But
this is not what is said here: what we are requesting is that no agent knows which
inputs the other agents actually got. If the wrong information about the winner of
the elections is provided in one of the locations, the corresponding agent will act
accordingly, without noticing that he is at odds with all the others.

3. Measurement independence: this is not a constraint on P but on $\rho(\lambda|x,y)$ and says
that the choice of $\lambda$ should not depend on the input; which is quite natural because
the input is supposed to be received at a later time. So

$$\rho(\lambda|x,y) = \rho(\lambda). \qquad (7)$$

By Bayes' rule, this implies that the distribution of (x,y) is independent of $\lambda$. In
short, the agreement of the bank agents and the actual winner of the elections
are chosen by independent processes; as an operational corollary, the agents must
prepare a strategy for all possible winners, even if one winner is a priori more
probable than the others.

^4 I follow the presentation of [Hall 2011], which belongs to a long series of works dating back to a
study by Jarrett in 1988, because it suits the structure of this text. It is fair to mention here that
some very competent people strongly disagree with such an approach. In a nutshell, they consider that
the LV assumption should be primitive: that is, according to this view, any partition of the LV into
further assumptions is arbitrary. The real physical interest of Bell inequalities would lie in the tension
they generate with relativity. For an example of such a position, which probably matches Bell's own, see
e.g. [Norsen 2009]. As for myself, I stand strongly for the conviction that Bell inequalities are one of
the deepest results in physics. As such, I am in fact happy to see that competent people adopt different
views on why they are interesting and have different opinions on how to introduce them: for further
inspiring reading, one can refer to [Gisin 2009], [Wood and Spekkens 2012], [Gill 2012].

^5 Some people are astonished at this, because they naively thought that statistical distributions can
only wash out correlations, and not create them. Of course, it depends on what one averages over:
averaging over the outcomes definitely washes correlations out; averaging over the instructions $\lambda$ is the
origin of correlations in scenarios of pre-established agreement.

When we put everything together, we obtain the mathematical characterization of the
explanation through pre-established agreement:

$$P_{LV}(a,b|x,y) = \int d\lambda\, \rho(\lambda)\, P(a|x,\lambda)\, P(b|y,\lambda). \qquad (8)$$

I have just used the notation "LV" to mean local variables^6, the expression used in physics
to refer to pre-established agreement, which I shall use for its compactness though I find
it mildly confusing. In computer science, pre-established agreement is usually called
shared randomness.
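
As an illustration of (8), here is a minimal Python sketch that assumes a finite set of values for λ, so that the integral becomes a sum. The names `p_lv`, `rho`, `resp_a`, `resp_b`, `table` and `marginal_a` are hypothetical, and the toy strategy (a shared fair bit; settings and outcomes labelled 0 and 1 rather than from 1) is an assumption chosen only for brevity.

```python
from fractions import Fraction

def p_lv(a, b, x, y, rho, resp_a, resp_b):
    """Evaluate (8) for a finite set of lambdas.

    rho:    dict lambda -> probability rho(lambda)
    resp_a: function (a, x, lam) -> P(a|x, lam), Alice's local response
    resp_b: function (b, y, lam) -> P(b|y, lam), Bob's local response
    """
    return sum(w * resp_a(a, x, lam) * resp_b(b, y, lam)
               for lam, w in rho.items())

# Toy pre-established agreement (an assumption for illustration):
# lambda is a shared fair bit. Alice outputs it whatever her input;
# Bob outputs it if y == 0 and flips it if y == 1.
rho = {0: Fraction(1, 2), 1: Fraction(1, 2)}

def resp_a(a, x, lam):
    return Fraction(int(a == lam))

def resp_b(b, y, lam):
    target = lam if y == 0 else 1 - lam
    return Fraction(int(b == target))

def table(x, y):
    """Full distribution P(a,b|x,y) for the toy model."""
    return {(a, b): p_lv(a, b, x, y, rho, resp_a, resp_b)
            for a in (0, 1) for b in (0, 1)}

def marginal_a(a, x, y):
    """Alice's marginal; in an LV model it cannot depend on Bob's input y."""
    return sum(p_lv(a, b, x, y, rho, resp_a, resp_b) for b in (0, 1))
```

In this toy model the outcomes are perfectly correlated for y = 0 and perfectly anticorrelated for y = 1, while each party's own marginal stays uniform: the correlations come entirely from averaging over λ.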



2.2.2 Deterministic local variables 

In the example, we have naturally assumed that the agents know exactly what to do
when they get to know the winner of the elections. No such restriction was imposed on
the mathematics: $P(a|x,\lambda)$ and $P(b|y,\lambda)$ have only been required to be valid probability
distributions. Deterministic local variables are a special case of LV in which, for any
input, the outcome is fully determined by $\lambda$:

$$P(a|x,\lambda) = \delta_{a=f(x,\lambda)}, \quad P(b|y,\lambda) = \delta_{b=g(y,\lambda)}. \qquad (9)$$

An equivalent way of characterizing deterministic local variables consists in just giving
the list of outcomes for all possible inputs:

$$\lambda \equiv \{a_1, a_2, \dots, a_{M_A}; b_1, b_2, \dots, b_{M_B}\} \in \mathcal{A}^{|\mathcal{X}|} \times \mathcal{B}^{|\mathcal{Y}|}, \qquad (10)$$

the link with the previous notation being $a_x = f(x,\lambda)$ and $b_y = g(y,\lambda)$. From this
notation, it is obvious that the number of deterministic local points is $m_A^{M_A} m_B^{M_B}$.

^6 The traditional expression, still widely used, is local hidden variables, which carries its unfortunate
weight of mysticism. Notably, the adjective "hidden" is irrelevant: the characterization we have just
given is independent of whether $\lambda$ is kept hidden or is disclosed publicly. As for the adjective "local", it is
meant to convey two messages: (i) the pre-established agreement could have been worked out when all
the agents were at the same location (in physics, when the signals to be measured were created in the
source); and (ii) later, each agent acts according only to the information available at its own location.
"Locality" is therefore relevant, but must be understood in a precise sense.

The importance of deterministic LV stems from the following observation, first
proved in [Fine 1982]:

Proposition 2.1. A family of probability distributions $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ can be explained with pre-established
agreement if and only if it can be explained with deterministic local variables.

Proof. The "if" direction is obvious. For the reverse, for any fixed $\lambda$, we are going to
construct a deterministic model that gives the same statistics as the initial stochastic
model^7. Let's label $\mathcal{A} = \{1, 2, \dots, m_A\}$ for convenience: the cumulative distribution
$S(a) = \sum_{a' \leq a} P(a'|x,\lambda)$ can obviously be computed in the LV model. Let us then add a
new local parameter $\mu_A$ drawn at random between 0 and 1, then output a according to
the following deterministic rule:

$$P_d(a|x,\lambda,\mu_A) = \begin{cases} 1 & \text{if } S(a-1) \leq \mu_A < S(a), \\ 0 & \text{otherwise.} \end{cases} \qquad (11)$$

If $\mu$ is drawn with uniform distribution, the original stochastic distribution is recovered:

$$\int_0^1 d\mu\, P_d(a|x,\lambda,\mu) = \int_{S(a-1)}^{S(a)} d\mu = P(a|x,\lambda).$$

Therefore (8) can be rewritten as

$$P_{LV}(a,b|x,y) = \int d\lambda\, \rho(\lambda) \int_0^1 d\mu_A \int_0^1 d\mu_B\, P_d(a|x,\lambda,\mu_A)\, P_d(b|y,\lambda,\mu_B), \qquad (12)$$

which is the desired convex sum of deterministic LV for the enlarged variable $\lambda' =
(\lambda, \mu_A, \mu_B)$ with distribution $\rho'(\lambda')d\lambda' = \rho(\lambda)\,d\lambda\,d\mu_A\,d\mu_B$. □

It follows immediately that the finite set of $\lambda$'s defined by the $m_A^{M_A} m_B^{M_B}$ deterministic
local points is sufficient to describe any LV statistics. Indeed, each $P_d(a|x,\lambda,\mu)$ given in
(11) is one of the $m_A^{M_A}$ deterministic points for A; and similarly for B. By grouping the
terms according to the local deterministic points, (12) becomes

$$P_{LV}(a,b|x,y) = \sum_{j=1}^{m_A^{M_A}} \sum_{k=1}^{m_B^{M_B}} p_{jk}\, \delta_{a=f_j(x)}\, \delta_{b=g_k(y)} \qquad (13)$$

with $\lambda \equiv (j,k)$ and $\sum_{j,k} p_{jk} = 1$.

Therefore LV statistics can always be explained by a deterministic model. Of course, 
this does not mean that such an explanation must necessarily be adopted: your favorite 
explanation, as well as the "real" phenomenon, may not involve determinism. For in- 
stance, as we shall see soon, measurement on separable quantum states leads to LV 
statistics, but this does not make quantum theory deterministic (if that is your favorite 
explanation), nor forces us to believe that the physical phenomenon "out there" is de- 
terministic. 
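
The construction used in the proof, rule (11), can be tried out numerically. The following Python sketch is only an illustration under stated assumptions: the function names `deterministic_output` and `estimate` are hypothetical, the local response $P(a|x,\lambda)$ for fixed x and λ is passed as a plain list, and the extra variable μ is drawn with Python's pseudo-random generator.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def deterministic_output(probs, mu):
    """Rule (11): return the outcome a such that S(a-1) <= mu < S(a).

    probs: list [P(1|x,lam), ..., P(m|x,lam)] for fixed x and lam
    mu:    extra local variable in [0, 1)
    """
    cumulative = list(accumulate(probs))   # S(1), ..., S(m)
    index = bisect_right(cumulative, mu)   # number of S(a) that are <= mu
    # Guard against S(m) falling slightly below 1 through rounding.
    return min(index, len(probs) - 1) + 1  # outcomes are labelled 1..m

def estimate(probs, runs=100_000, seed=0):
    """Frequencies obtained when mu is drawn uniformly in each run."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    for _ in range(runs):
        mu = rng.random()                  # uniform in [0, 1)
        counts[deterministic_output(probs, mu) - 1] += 1
    return [c / runs for c in counts]
```

For a fixed μ the output is a deterministic function of the local data only; the stochastic response reappears, up to sampling fluctuations, once μ is averaged over.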

^7 An interesting mistake (because I committed it myself!) is to believe that the proof is trivial, because
it would just be the fact that any probability distribution can be seen as the result of ignorance. This
reasoning would lead to the decomposition $P(a|x,\lambda) = \sum_\mu q_\mu(x,\lambda)\,\delta_{a=f(x,\mu,\lambda)}$, where $\mu$ labels one of
the deterministic points. But $q_\mu(x,\lambda)$ depends on x a priori: if we insert this expression, and the analogous
one for B, into (8), we get a dependence on x and y in the distribution $\rho'(\lambda,\mu_A,\mu_B)$. This is why the
proof is not entirely trivial.

Finally, Proposition 2.1 has an interesting translation using the notation (10):

Corollary 2.1. $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ can be explained with pre-established agreement if and only if there
exists a joint probability distribution $P(a_1, \dots, a_{M_A}; b_1, \dots, b_{M_B})$ such that each $P(a,b|x,y)$
can be obtained as the two-party marginal

$$P_{LV}(a,b|x,y) = \sum_{\{a_j | j \neq x\}} \sum_{\{b_k | k \neq y\}} P(a_1, \dots, a_{M_A}; b_1, \dots, b_{M_B}). \qquad (14)$$
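
To see the corollary at work, the sketch below enumerates the deterministic lists of (10) and computes the two-party marginal (14) from a joint distribution over them. It is a minimal illustration, not a general tool: the function names `deterministic_points` and `marginal` are hypothetical, and the two joint distributions at the end are arbitrary choices made for the example.

```python
from fractions import Fraction
from itertools import product

def deterministic_points(M_A, m_A, M_B, m_B):
    """All lists (a_1..a_{M_A}; b_1..b_{M_B}) as in (10)."""
    alice = list(product(range(1, m_A + 1), repeat=M_A))
    bob = list(product(range(1, m_B + 1), repeat=M_B))
    return [(a_list, b_list) for a_list in alice for b_list in bob]

def marginal(joint, a, b, x, y):
    """Two-party marginal (14): sum the joint over every entry except a_x and b_y.

    joint: dict mapping (a_list, b_list) -> probability
    x, y:  settings, labelled from 1 as in the text
    """
    return sum(p for (a_list, b_list), p in joint.items()
               if a_list[x - 1] == a and b_list[y - 1] == b)

# Two settings and two outcomes per party: 2^2 * 2^2 = 16 deterministic points.
points = deterministic_points(2, 2, 2, 2)

# Illustrative joint distributions (assumptions for the example):
uniform = {pt: Fraction(1, len(points)) for pt in points}
correlated = {((1, 1), (1, 1)): Fraction(1, 2),
              ((2, 2), (2, 2)): Fraction(1, 2)}
```

With `uniform`, every marginal equals 1/4; with `correlated`, Alice and Bob always agree, whatever the settings.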

2.3 The power of LV 

At this stage, physicists usually hurry forward and prove that quantum statistics cannot
be explained with LV. This is of course my goal as well, but too much haste in taking 
this step may convey the impression that the LV models are silly and artificial, and that 
it is only normal that quantum correlations don't fit in that class. I want to devote one 
paragraph to explain what LV models could do for you, beyond explaining coordinated 
behavior in stock market bidding. 

• We have described the Bell experiment using two parties because quantum statistics
involving one party can always be reproduced with LV; which means, according to
what we just proved, that they can even be reproduced by a deterministic explanation.
The reason is that $\lambda$ may contain the description of the quantum state $\rho$: one
can compute on paper the probability distribution for any measurement, then just
draw the outcomes according to that distribution using pseudo-randomness. As the
opening parable illustrates, this statement is obvious once one thinks freely about
it, but is nevertheless puzzling for someone (the student) who has been (de)formed
in thinking that single-party phenomena already unveil the intrinsic randomness of
the quantum. To be sure, one can discover many quantum effects in single-party
measurements, but not in a black-box scenario: further assumptions are needed^8.

Though anecdotal, it is instructive to present a pretty compact LV model that
reproduces exactly the quantum statistics of a single qubit. If the qubit is in the
state $\frac{1}{2}(\mathbb{1} + \vec m\cdot\vec\sigma)$, the quantum prediction for measurement along direction $\hat a$ is

$$P(a|\hat a) = \frac{1}{2}(1 + a\,\vec m\cdot\hat a), \quad \text{that is} \quad \langle a\rangle_{\hat a} = \vec m\cdot\hat a \qquad (15)$$

for $a \in \{-1,+1\}$. In the LV model, the system is represented by $\vec m$ and by a unit
vector $\vec\lambda$ uniformly distributed^9 on the sphere, i.e. $\rho(\lambda)d\lambda = \frac{1}{4\pi}\sin\theta\,d\theta\,d\varphi$. For
each $\vec\lambda$, the outcome is deterministically computed to be

$$a(\vec\lambda) = \mathrm{sign}[(\vec m - \vec\lambda)\cdot\hat a] \qquad (16)$$

which is either +1 or -1 as it should. It is then easy to prove^10 that

$$\int d\lambda\, \rho(\lambda)\, a(\vec\lambda) = \vec m\cdot\hat a. \qquad (17)$$

^8 The knowledge of the physical degree of freedom usually provides the additional assumption. For
instance, the blackbody radiation or the Stern-Gerlach experiment can be proved to be non-classical once
one knows that the electromagnetic field, respectively a magnetic moment, is being measured. Similarly,
there are often very good reasons to describe the system as composite, even if all the components
contribute to the same signal: the prototypical examples come from condensed matter physics, where
one wants to describe conductivity, magnetization etc. as arising not from an unspecified piece of matter,
but from an arrangement of many atoms. Tests like "contextuality" à la Kochen-Specker, or sequential
measurements à la Feynman or Leggett-Garg, need a minimal amount of assumptions to prove that the
outcomes do not come from a pre-established agreement. Indeed, no detailed knowledge of the system
and the measurement is needed, but one must assume that the measurement device does not write in,
nor read from, other degrees of freedom than the relevant one.

^9 Nothing in this model requires $\vec\lambda$ to be "drawn at random" in each run: the sequence of $\vec\lambda$ may be
pre-registered.

Another nice illustrative example is the calculation of the double slit experiment with
Bohmian trajectories, which reproduces the quantum interferences while also being
able to say through which slit the particle has passed. To conclude on a personal
note: if quantum physics consisted only of single-party measurements, I would
not see any compelling reason to believe in its intrinsic randomness.

• LV can also simulate several bipartite scenarios. One of the most obvious $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$
that can be described with LV is exactly the one that is presented in popular lore
as an astonishing feat of quantum entanglement: the fact that, when they share a
singlet state, Alice and Bob always observe opposite results when they measure in
the same direction, i.e.

$$\langle ab\rangle_{\hat a = \hat b} = -1. \qquad (18)$$

Indeed, any list

$$\lambda = \{\dots, a_{\hat u}, a_{\hat v}, \dots, b_{\hat u}, b_{\hat v}, \dots\} \qquad (19)$$

deterministically obtains (18) as soon as $a_{\hat u} = -b_{\hat u}$ for all possible directions $\hat u$.

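If now we want to create correlations, we need to average over various $\lambda$. For the
sake of illustration, let us restrict ourselves to two possible measurements on each
side, so $\lambda = \{a_{\hat u}, a_{\hat v}; b_{\hat u}, b_{\hat v}\}$. In order to satisfy (18) we can have

$$\begin{array}{l}
\lambda_1 = (+,+;-,-) \\
\lambda_2 = (+,-;-,+) \\
\lambda_3 = (-,+;+,-) \\
\lambda_4 = (-,-;+,+)
\end{array} \qquad (20)$$

each drawn with probability $P(\lambda_k) = q_k$. Moreover, any choice such that $q_1 = q_4$
and $q_2 = q_3 = \frac{1}{2} - q_1$ reproduces $\langle a\rangle_{\hat u} = \langle a\rangle_{\hat v} = 0$. By choosing $q_1 = \frac{1}{4}(1 + \hat u\cdot\hat v)$,
we can even reproduce the quantum predictions for the case where the two
measurements are not the same, i.e. $\langle ab\rangle_{\hat u,\hat v} = \langle ab\rangle_{\hat v,\hat u} = -\hat u\cdot\hat v$.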

This result is more striking than its simplicity suggests. Consider the case u J- v, 
for instance u = x and v — y: the statistics we have just reproduced with LV are the 



^^Without loss of generality, one can choose a = z = {9 = 0, ip), since nothing else in the problem 
specifies the choice of spherical coordinates. 
expected correlations for an error-free run of the BB84 protocol for quantum key
distribution! This means that the security proofs of that protocol must be based
on additional assumptions, other than the mere observation of those statistics,
and indeed upon closer inspection they assume that qubits are being measured (this
observation was crucial in the emergence of the device-independent outlook, see
Appendix B).

Ultimately, everything will fall into place: we are going to prove that the whole
set of quantum predictions $\langle ab \rangle_{\vec{a},\vec{b}} = -\vec{a} \cdot \vec{b}$ for any pair of directions cannot be
reproduced with LV. Before that, please bear with one more example of the power
of LV; and above all, never again invoke perfect anticorrelations as evidence
for entanglement.

• Our second example of bipartite statistics that can be simulated with LV are those 
in which one of the parties performs only one measurement. Once again, when 
one thinks about it, it is clear that this should be the case. Indeed, the set of 
possible measurements for both boxes can be part of the information in $\lambda$. If
Alice's set contains a single element, $|\mathcal{X}| = 1$, then Alice's actual measurement in
each individual run is also known. In this case, $\lambda$ can determine Alice's outcome and
distribute Bob's outcome accordingly for all his possible measurements. Therefore,
in order to rule out an explanation by pre-established agreement, one must consider
families $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ such that both $|\mathcal{X}| > 1$ and $|\mathcal{Y}| > 1$.
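Returning to the first bipartite example, the four-list model of Eq. (20) is simple enough to check by direct enumeration. The following is a minimal sketch, not part of the original argument: the names `LISTS`, `weights` and `lv_statistics` are illustrative choices, and each list is stored as the tuple (a_u, a_v, b_u, b_v) with outcomes labeled +1 and -1.

```python
# The four deterministic lists of Eq. (20), written as (a_u, a_v, b_u, b_v).
# Each list has b = -a in both directions, so Eq. (18) holds in every run.
LISTS = [
    (+1, +1, -1, -1),  # lambda_1
    (+1, -1, -1, +1),  # lambda_2
    (-1, +1, +1, -1),  # lambda_3
    (-1, -1, +1, +1),  # lambda_4
]


def weights(u, v):
    """Return (q_1, q_2, q_3, q_4) with q_1 = q_4 = (1 + u.v)/4 and q_2 = q_3 = 1/2 - q_1."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    q1 = (1 + dot) / 4
    return (q1, 0.5 - q1, 0.5 - q1, q1)


def lv_statistics(u, v):
    """Marginals and correlators obtained by averaging the four lists."""
    q = weights(u, v)

    def avg(f):
        return sum(qk * f(lam) for qk, lam in zip(q, LISTS))

    return {
        "a_u": avg(lambda lam: lam[0]),
        "a_v": avg(lambda lam: lam[1]),
        "ab_uu": avg(lambda lam: lam[0] * lam[2]),
        "ab_vv": avg(lambda lam: lam[1] * lam[3]),
        "ab_uv": avg(lambda lam: lam[0] * lam[3]),
        "ab_vu": avg(lambda lam: lam[1] * lam[2]),
    }


if __name__ == "__main__":
    x, y = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    print(lv_statistics(x, y))  # orthogonal directions, the BB84-like case
```

For orthogonal directions the sketch returns perfect anticorrelation when the settings coincide and no correlation when they differ, which is the pattern discussed above for BB84.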

2.4 Bell inequalities and their violation 

2.4.1 Bell inequalities as facets of the local polytope 

The starting point for our study of Bell inequalities is the following observation: for
any fixed scenario $(\mathcal{X},\mathcal{A};\mathcal{Y},\mathcal{B})$, the set $\mathcal{L}$ of all the families of probability distributions
that can be obtained with LV is convex. In other words, if $\mathcal{P}_1 \in \mathcal{L}$ and $\mathcal{P}_2 \in \mathcal{L}$, then
$q\mathcal{P}_1 + (1-q)\mathcal{P}_2 \in \mathcal{L}$ for all $q \in [0,1]$. This is clear from the interpretation: $\lambda$ can contain
the information of whether $\mathcal{P}_1$ or $\mathcal{P}_2$ is realized in each run. We now need to describe
the convex set $\mathcal{L}$ in more detail.

A convex set is fully determined by specifying all its extremal points, i.e. those 
points that cannot be written as convex combinations of other points. We know from 
Proposition 2.1 that any $\mathcal{P} \in \mathcal{L}$ can be written as a convex sum of deterministic LV, and
it is easy to convince oneself that each deterministic local point is an extremal point of $\mathcal{L}$.
Moreover, there are finitely many such points, precisely $m_A^{M_A} m_B^{M_B}$, as we noted above.
A convex set with finitely many extremal points is concisely referred to as a "polytope",
so $\mathcal{L}$ will be called the local polytope for the scenario $(\mathcal{X},\mathcal{A};\mathcal{Y},\mathcal{B})$.

A polytope $\mathcal{L}$ embedded in $\mathbb{R}^D$ is delimited by $(D-1)$-dimensional hyperplanes called
facets. Such a hyperplane must have at least D extremal points lying on it, while all the 
other extremal points must be found on the same side (if some extremal points are on 
one side and others on the other, then the hyperplane cuts through the polytope and is 



^•^In BB84, one expects correlations rather than anticorrelations; but it is trivial to obtain the former 
from the latter by classical post-processing: for instance, one can ask Bob to systematically flip his bit.
not a facet). Mathematically, let $\vec{n} \in \mathbb{R}^D$ be the vector normal to the facet and oriented
outside the polytope: if the equation for the points $\mathcal{P}$ of the facet is $\vec{n} \cdot \mathcal{P} = f$, then

$$\vec{n} \cdot \mathcal{P} \leq f \quad \text{for all } \mathcal{P} \in \mathcal{L}. \tag{21}$$

In other words, if $\vec{n} \cdot \mathcal{P} > f$, the point $\mathcal{P}$ cannot belong to $\mathcal{L}$.

In the specific case of a probability polytope like $\mathcal{L}$, some facets are given by the
equations $P(a,b|x,y) = 0$ and $P(a,b|x,y) = 1$. Such facets are called trivial because
they are not proper to $\mathcal{L}$: indeed, they describe the constraints $0 \leq P(a,b|x,y) \leq 1$ that
any probability distribution must satisfy. There must be other facets, however, which
capture the constraints proper to $\mathcal{L}$. The inequalities (21) associated to the non-trivial
facets of $\mathcal{L}$ are the Bell inequalities for the scenario under study.

Before studying a specific example, we have to determine the minimal $D$ such that
$\mathcal{L} \subset \mathbb{R}^D$. Generically, a family $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ contains $M_A M_B$ probability distributions, each
of which is specified by $m_A m_B$ positive numbers constrained to sum to 1; so it can be
fully specified by giving $D_{total} = M_A M_B (m_A m_B - 1)$ independent numbers. The $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$
that can be reproduced by LV satisfy the additional no-signaling constraints, namely
$\sum_b P(a,b|x,y) = \sum_b P(a,b|x,y') = P(a|x)$ for all $y,y' \in \mathcal{Y}$ and
$\sum_a P(a,b|x,y) = \sum_a P(a,b|x',y) = P(b|y)$ for all $x,x' \in \mathcal{X}$. The counting can be done as follows. One can
first take the marginals as independent parameters: there are $M_A(m_A - 1)$ independent
$P(a|x)$, and $M_B(m_B - 1)$ independent $P(b|y)$. Consider now any choice of $(x,y)$: for
every fixed $b = \bar{b}$, once the marginal $P(a)$ is given, one is left with $m_A - 1$ independent
numbers $P(a,\bar{b})$; similarly, for every fixed $a = \bar{a}$, once the marginal $P(b)$ is given, one
is left with $m_B - 1$ independent numbers $P(\bar{a},b)$. All in all, a no-signaling $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ can be
fully specified by giving

$$D_{ns} = M_A M_B (m_A - 1)(m_B - 1) + M_A (m_A - 1) + M_B (m_B - 1) \tag{22}$$

independent numbers. Since our goal is to compare LV with quantum physics, which is
also no-signaling, it is sufficient to describe $\mathcal{L}$ as embedded in $\mathbb{R}^{D_{ns}}$.
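The counting above is easy to tabulate for any scenario. The following is a small sketch under the stated formulas; the function name `local_polytope_counts` is an illustrative choice, and the number of deterministic points is taken to be one output per input for each party.

```python
def local_polytope_counts(M_A, M_B, m_A, m_B):
    """Return (number of deterministic points, D_total, D_ns) for a bipartite scenario.

    M_A, M_B: number of inputs (settings) for Alice and Bob.
    m_A, m_B: number of outputs per setting for Alice and Bob.
    """
    # One output must be assigned to every input of each party.
    n_deterministic = m_A ** M_A * m_B ** M_B
    # M_A*M_B normalized distributions over m_A*m_B outcomes.
    d_total = M_A * M_B * (m_A * m_B - 1)
    # Eq. (22): independent numbers once no-signaling is imposed.
    d_ns = (M_A * M_B * (m_A - 1) * (m_B - 1)
            + M_A * (m_A - 1) + M_B * (m_B - 1))
    return n_deterministic, d_total, d_ns


if __name__ == "__main__":
    print(local_polytope_counts(2, 2, 2, 2))  # the simplest scenario
```

For two inputs and two outputs per party, this gives 16 deterministic points in an 8-dimensional no-signaling space, compared with 12 parameters before no-signaling is imposed.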

2.4.2 A case study: CHSH 

It is instructive to work out explicitly the facets of $\mathcal{L}$ and derive the corresponding Bell
inequalities. The simplest scenario has $M_A = M_B = m_A = m_B = 2$. In this case,
$D_{ns} = 8$ and there are 16 extremal points: finding the facets is a very easy task for a
computer, but still cumbersome to write down here. We are rather going to study a very
meaningful sub-polytope of $\mathcal{L}$.

For a choice of settings {x,y) and binary outcomes, the correlation coefficient is 
defined by 

$$E_{xy} = P(a = b|x,y) - P(a \neq b|x,y). \tag{23}$$

^^To set history straight, such inequalities were noticed pretty early in statistics: Boole certainly 
describes them. But in those classical days, neither he nor anyone else considered the possibility of 
their violation. By contrast, John Bell derived a single inequality, which is not even a facet but a lower 
dimensional hyperline on a facet, because he used additional constraints in the derivation. Nevertheless, 
he made the great step of pioneering the method in the study of quantum physics; so, at least in physics, 
inequalities of this type are generically named after him. 
Any quadruple of numbers

$$\vec{v} = (E_{00}, E_{01}, E_{10}, E_{11}) \tag{24}$$

with $-1 \leq E_{xy} \leq 1$ is a priori a valid correlation vector. The sixteen vectors such that
$\vec{v} \cdot \vec{v} = 4$, i.e. those vectors whose components are either $+1$ or $-1$, are extremal points
of a polytope embedded in $\mathbb{R}^4$.

To see which constraints are added by requiring that $\mathcal{P} \in \mathcal{L}$, it is convenient to use the
labeling convention $a,b \in \{-1,+1\}$. With this choice, $E_{xy} = \langle a_x b_y \rangle$. In particular, for
deterministic local points it holds $E_{xy} = a_x b_y$, which directly leads to $E_{00}E_{01}E_{10}E_{11} = 1$.
Therefore, the extremal points of the local correlation polytope are the eight vectors

$$\begin{array}{ll}
\vec{v}_1 = (+1,+1,+1,+1), & \vec{v}_5 = -\vec{v}_1 = (-1,-1,-1,-1) \\
\vec{v}_2 = (+1,+1,-1,-1), & \vec{v}_6 = -\vec{v}_2 = (-1,-1,+1,+1) \\
\vec{v}_3 = (+1,-1,+1,-1), & \vec{v}_7 = -\vec{v}_3 = (-1,+1,-1,+1) \\
\vec{v}_4 = (+1,-1,-1,+1), & \vec{v}_8 = -\vec{v}_4 = (-1,+1,+1,-1).
\end{array} \tag{25}$$

Notice that $\{\vec{v}_1,\vec{v}_2,\vec{v}_3,\vec{v}_4\}$ are mutually orthogonal, so in particular they are linearly
independent: this implies that $\mathbb{R}^4$ is the smallest embedding for the local correlation
polytope. Now we want to characterize the facets of this polytope. Four linearly independent
vectors are required to define a 3-dimensional hyperplane: our task consists of
listing all sets of four linearly independent extremal points, constructing the hyperplane
that they generate, and checking if it is indeed a facet.
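The list of eight extremal vectors and their orthogonality can be confirmed by brute force over the sixteen deterministic strategies. This is only a sketch: the helper names are illustrative, and a strategy is encoded as the tuple (a_0, a_1, b_0, b_1) of outputs in {-1, +1}.

```python
from itertools import product


def correlation_vector(a0, a1, b0, b1):
    """(E_00, E_01, E_10, E_11) for a deterministic local point, E_xy = a_x * b_y."""
    return (a0 * b0, a0 * b1, a1 * b0, a1 * b1)


def local_extremal_vectors():
    """Distinct correlation vectors produced by the 16 deterministic strategies."""
    vectors = {correlation_vector(*s) for s in product((+1, -1), repeat=4)}
    # Descending order puts v_1, ..., v_4 (first component +1) in front.
    return sorted(vectors, reverse=True)


def dot(u, v):
    """Euclidean scalar product of two correlation vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


if __name__ == "__main__":
    for vec in local_extremal_vectors():
        print(vec)
```

Sixteen strategies collapse to eight vectors because flipping all outputs of both parties leaves every product a_x b_y unchanged.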

The symmetry of the problem makes the task simple. There are sixteen sets of
four linearly independent vectors, namely the $V_s = \{s_1\vec{v}_1, s_2\vec{v}_2, s_3\vec{v}_3, s_4\vec{v}_4\}$ with
$s = [s_1,s_2,s_3,s_4] \in \{-1,+1\}^4$. The normal to the hyperplane generated by $V_s$ is the solution
to the equation $\vec{n}_s \cdot (s_k\vec{v}_k) = f_s = 4$ (the constant being chosen for simplicity), which
is readily found to be $\vec{n}_s = \sum_{k=1}^{4} s_k\vec{v}_k$, either by direct inspection or by noticing that
$\vec{v}_i \cdot \vec{v}_j = 4\delta_{ij}$. Moreover, the extremal points that do not define the plane are the four
$-s_k\vec{v}_k$, which all lie on the same side of the hyperplane since obviously $\vec{n}_s \cdot (-s_k\vec{v}_k) = -4$.
Therefore each of the sixteen sets $V_s$ defines a facet by the condition

$$\vec{n}_s \cdot \vec{v} = 4 \quad \text{with} \quad \vec{n}_s = \sum_{k=1}^{4} s_k\vec{v}_k. \tag{26}$$



To finish our study, we just have to inspect each of these facets (since $\vec{n}_{-s} = -\vec{n}_s$, we



^''This is because the origin (0,0,0,0) is inside the polytope. If the origin were on a facet,
three linearly independent vectors would be sufficient to specify such a facet. This is a technical
point that does not need to bother us here, but may play a role for most compact parametrizations of
probability polytopes.
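can just look at eight of them). We find:

$$\begin{array}{lcl}
\vec{n}_{[+1,+1,+1,+1]} = (4,0,0,0) & \longrightarrow & 4E_{00} \leq 4, \\
\vec{n}_{[+1,+1,+1,-1]} = (2,2,2,-2) & \longrightarrow & 2E_{00} + 2E_{01} + 2E_{10} - 2E_{11} \leq 4, \\
\vec{n}_{[+1,+1,-1,+1]} = (2,2,-2,2) & \longrightarrow & 2E_{00} + 2E_{01} - 2E_{10} + 2E_{11} \leq 4, \\
\vec{n}_{[+1,-1,+1,+1]} = (2,-2,2,2) & \longrightarrow & 2E_{00} - 2E_{01} + 2E_{10} + 2E_{11} \leq 4, \\
\vec{n}_{[+1,+1,-1,-1]} = (0,4,0,0) & \longrightarrow & 4E_{01} \leq 4, \\
\vec{n}_{[+1,-1,+1,-1]} = (0,0,4,0) & \longrightarrow & 4E_{10} \leq 4, \\
\vec{n}_{[+1,-1,-1,+1]} = (0,0,0,4) & \longrightarrow & 4E_{11} \leq 4, \\
\vec{n}_{[+1,-1,-1,-1]} = (-2,2,2,2) & \longrightarrow & -2E_{00} + 2E_{01} + 2E_{10} + 2E_{11} \leq 4.
\end{array} \tag{27}$$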

In summary, the local correlation polytope for the simplest scenario has sixteen facets. Up
to relabeling of the inputs and/or of the outcomes, eight of these describe the constraint
$E_{00} \leq 1$ and are therefore trivial, while the other eight describe the constraint

$$S = E_{00} + E_{01} + E_{10} - E_{11} \leq 2. \tag{28}$$

This constraint is not trivial, and indeed can be violated by valid correlation vectors
which do not belong to the local polytope: in particular, the vector $\vec{w} = (+1,+1,+1,-1)$
reaches up to $S = 4$. The inequality (28) is called CHSH from the names of Clauser,
Horne, Shimony and Holt, who derived it first in physics [Clauser, Horne, Shimony and Holt 1969].
It is the most studied of all Bell inequalities and we shall use it repeatedly in this text.

Had we studied the full local polytope $\mathcal{L}$, we would have found only eight more facets,
describing the trivial constraints $0 \leq P(a = 0|x) \leq 1$ and $0 \leq P(b = 0|y) \leq 1$ on the
marginals, which certainly cannot be captured from the correlations alone. In conclusion,
CHSH is the only Bell inequality in the scenario $M_A = M_B = m_A = m_B = 2$.
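The facet construction of this case study can be replayed numerically. The sketch below is only an illustration with hypothetical helper names: it builds the normal for every sign choice s as the sum of the vectors s_k v_k, and checks that the eight extremal points all satisfy n_s · v ≤ 4, with exactly four of them on the hyperplane.

```python
from itertools import product

# v_1, ..., v_4 of the local correlation polytope, and their opposites.
V = [(1, 1, 1, 1), (1, 1, -1, -1), (1, -1, 1, -1), (1, -1, -1, 1)]
EXTREMAL = V + [tuple(-c for c in v) for v in V]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def normal(s):
    """n_s = sum_k s_k v_k for a sign choice s = (s_1, s_2, s_3, s_4)."""
    return tuple(sum(sk * vk[i] for sk, vk in zip(s, V)) for i in range(4))


def facet_report():
    """Map each s to (n_s, max of n_s.v over extremal points, number of points with n_s.v = 4)."""
    report = {}
    for s in product((1, -1), repeat=4):
        n = normal(s)
        values = [dot(n, v) for v in EXTREMAL]
        report[s] = (n, max(values), values.count(4))
    return report


def chsh(v):
    """S = E_00 + E_01 + E_10 - E_11 for a correlation vector v."""
    return v[0] + v[1] + v[2] - v[3]


if __name__ == "__main__":
    for s, (n, largest, on_facet) in facet_report().items():
        print(s, n, largest, on_facet)
    print("max CHSH over local extremal points:", max(chsh(v) for v in EXTREMAL))
```

Since S is linear, its maximum over the polytope is reached at an extremal point, so the last line of the usage example reproduces the local bound.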



2.4.3 Detection loophole: faking it by denial of service 

We have adopted an operational approach, in which the violation of Bell inequalities
is read as the impossibility of a fair simulation of the observed statistics. From this
perspective, something that is normally (and rightly) considered as anecdotal in more
physically-based presentations acquires a huge importance: the possibility of faking a
violation of Bell inequalities by a clever denial of service. We can discuss it now with the
example of CHSH.

We have just seen that the correlation vector $\vec{w} = (+1,+1,+1,-1)$ reaches the maximal
possible value $S = 4$. The LV vectors $\vec{v}_1$, $\vec{v}_2$, $\vec{v}_3$ and $-\vec{v}_4$ differ from $\vec{w}$ only for
one choice of settings: for instance, $\vec{v}_1$ behaves like $\vec{w}$ as long as the pair $(x,y) = (1,1)$
is not chosen. Because of measurement independence, $\lambda$ cannot guarantee that (1,1)
won't be chosen in that run. However, $\lambda$ can prevent such an event from being seen by adding
the following instruction: for $x = 1$, Alice's box does not reply. Now, if Alice and Bob

^^Bell's original inequality [Bell 1964] is ultimately CHSH, but he did not derive it using the systematic
approach we just sketched. Rather, he had in mind measurements on a two-qubit singlet state, so he
imposed that the LV model should satisfy (18) exactly. This led to an expression that is valid only
under that assumption; which is enough in order to prove that quantum theory violates the inequality, 
but is not suitable for comparison with experiments, since one will never observe absolutely perfect 
(anti)correlations. 
naively estimate CHSH only from those instances in which both have got a reply, they
may easily be cheated into obtaining $S > 2$ (and even $S = 4$) whereas only LV were
used. If the bureaucrat and his friend had been aware of this possibility and the student 
had not, the two men might have faked a successful simulation. This possibility is called 
detection loophole in the physics jargon. 

This proves that post-selection is not an allowed data processing when dealing with 
Bell inequalities on a purely operational basis. The remedy to avoid the trap is clear:
Alice and Bob must compute the statistics from the whole sample. When the boxes (in 
physics, the detectors) do not give any answer, Alice and Bob can adopt a priori either 
of two strategies. The first consists in treating the lack of answer as another outcome, 
in which case the scenario changes from $(m_A, m_B)$ to $(m_A + 1, m_B + 1)$. The second
consists in agreeing in advance that the lack of answer will be treated as one of the 
outcomes (say +1, always the same). Which processing is more efficient may depend on 
the scenario; but in both cases, the statistics are evaluated on all the runs: a violation 
of Bell inequalities by those statistics is conclusive. 
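The denial-of-service trick can be made concrete with exact arithmetic. The following sketch rests on one particular cheating strategy that I am assuming for illustration (it is not spelled out in the text): half of the runs use the outputs of v_1 with Alice's box silent for x = 1, the other half use the outputs of v_3 with Alice's box silent for x = 0. The names `CHEAT`, `correlators` and `chsh` are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Each entry: (probability, Alice's outputs (a_0, a_1), Bob's outputs (b_0, b_1)).
# None means "the box does not reply" for that input.
CHEAT = [
    (HALF, (+1, None), (+1, +1)),  # behaves like v_1, silent for x = 1
    (HALF, (None, +1), (+1, -1)),  # behaves like v_3, silent for x = 0
]


def correlators(model, no_reply=None):
    """Return E_xy for the four pairs of settings.

    no_reply=None     : post-select on the runs in which both boxes replied.
    no_reply=+1 or -1 : keep every run, counting silence as that outcome.
    """
    E = {}
    for x in (0, 1):
        for y in (0, 1):
            num, den = Fraction(0), Fraction(0)
            for p, a, b in model:
                ax = a[x] if a[x] is not None else no_reply
                by = b[y] if b[y] is not None else no_reply
                if ax is None or by is None:
                    continue  # run discarded by post-selection
                num += p * ax * by
                den += p
            E[(x, y)] = num / den
    return E


def chsh(E):
    return E[(0, 0)] + E[(0, 1)] + E[(1, 0)] - E[(1, 1)]


if __name__ == "__main__":
    print("post-selected S:", chsh(correlators(CHEAT)))
    print("all runs, silence -> +1:", chsh(correlators(CHEAT, no_reply=+1)))
```

With post-selection the fake reaches the algebraic maximum; once every run is counted, with silence treated as a fixed outcome, the same boxes fall back within the local bound.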

2.4.4 Forgetting memory 

At the very beginning of this study, I stated the temporary assumption that the black 
boxes behave in the same way in each run. There is no compelling reason for this to be 
the case: at run i, the boxes may well be taking into account the inputs and outputs of 
the previous $i - 1$ runs. However, now we have collected enough knowledge to understand
that such memory effects cannot be used to fake a violation of Bell inequalities. 

Indeed, let us describe the most general protocol based on LV. We know already that 
the boxes can be assumed to act according to deterministic instructions $\lambda_i$ in each run
$i$. So far, we had assumed that $\lambda_i$ is drawn independently in each run. Let us now drop
this assumption and allow $\lambda_i$ to depend on all that happened in the previous $i - 1$ runs.
Even now, however, in each run, each box is prepared with deterministic instructions. We
could keep the reasoning general, but I find it more instructive here to refer explicitly
to CHSH. Recall that $E_{xy} = a_x b_y$ if we choose $a,b \in \{-1,+1\}$. The maximum of $S$ is
reached if $a_0 b_0 = +1$, $a_0 b_1 = +1$, $a_1 b_0 = +1$ and $a_1 b_1 = -1$. A local deterministic $\lambda_i$ can
satisfy at most three of these conditions, at the price of getting the fourth one completely
wrong; whence the LV bound $S = 3 - 1 = 2$.
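The claim that a deterministic instruction satisfies at most three of the four conditions can be checked exhaustively. A minimal sketch, with illustrative names, follows; it simply enumerates the sixteen assignments of (a_0, a_1, b_0, b_1).

```python
from itertools import product

# The four conditions whose simultaneous fulfilment would give S = 4.
TARGET = {(0, 0): +1, (0, 1): +1, (1, 0): +1, (1, 1): -1}


def satisfied(a, b):
    """Number of conditions a_x * b_y = target met by outputs a = (a_0, a_1), b = (b_0, b_1)."""
    return sum(a[x] * b[y] == t for (x, y), t in TARGET.items())


def all_counts():
    """Satisfied-condition counts for the 16 deterministic instructions."""
    return [satisfied((a0, a1), (b0, b1))
            for a0, a1, b0, b1 in product((+1, -1), repeat=4)]


if __name__ == "__main__":
    counts = all_counts()
    best = max(counts)
    # Each satisfied condition contributes +1 to S, each failed one -1.
    print("best count:", best, "-> S =", best - (4 - best))
```

The count is always odd: the product of the four targets is -1, while the product of the four a_x b_y is +1, so the number of mismatches cannot be even.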

Now, the crucial observation is that, in a Bell test, Alice and Bob choose x and y 
independently in each run: in particular, the settings are completely uncorrelated from 
the $\lambda$'s, including the past ones. This means that, irrespective of how $\lambda_i$ is chosen,
Alice and Bob may choose the "wrong" pair of settings (and the total probability with 
which this happens does not matter, since S is estimated from conditional probabilities). 

^^In the normal working of physics, which is the study of nature with characterized devices and without 
conspiracy theories, post-selection is licit as soon as one knows how the detector works and knows that 
the causes of reduced efficiency are completely independent from the choice of the measurement setting. 
In this sense, we can very safely claim that the violation of Bell inequalities has been observed (in fact, 
far more often than many other physical effects, that are the object of less close scrutiny because their 
consequences are far less deep). However, when it comes to characterizing untrusted black boxes, or to
convincing a skeptic, then post-selection is not allowed.



The device-independent outlook on quantum physics 






Therefore, memory effects cannot be used to fake a violation of Bell inequalities with LV, 
as claimed.

2.4.5 The message in the violation 

We have accumulated all the notions required for studying Bell inequalities in the context
of quantum physics; indeed, it is worthwhile stressing very explicitly that no element
of quantum theory has been used so far. This means in particular that the meaning
of the violation of Bell inequalities is theory-independent. In today's scientific jargon,
we say that the violations observed in laboratories are due to "quantum entanglement".
Maybe future scientists will use different concepts, but this won't change the fact.

As for what the meaning of the violation actually is, rivers of ink have been spent
in arguing on that, and even frequently-used terms like "nonlocality" and "violation of
local realism" are the object of sometimes heated debate. I shall keep an economical
approach: if the inequalities are violated, pre-established agreement is not the explanation.
As a consequence, at least one of the assumptions that characterize LV is falsified. We
described these assumptions in paragraph 2.2.1. Let us now see what the denial of each
assumption implies:

• One can deny outcome independence, which, as we have seen, is equivalent to 
adopting a non-deterministic favorite explanation. If this option is chosen, the 
violation of Bell inequalities proves the existence of intrinsic randomness. Unless 
stated otherwise, the remainder of this text will be written from this perspective. 
This is what the mathematics of quantum theory suggest, since in general ^ 
cannot be written as (O ; it fits also the so-called orthodox interpretation. But I
want to put the stress on two remarks: 

- The violation of Bell inequalities is the phenomenon that proves the existence 
of intrinsic randomness. In other words, at the risk of repeating what I wrote 
in paragraph 2.3, it's only because two degrees of freedom, even separated in
space, violate a Bell inequality that we can safely infer the presence of intrinsic
randomness also when a single degree of freedom is involved, as in the double- 
slit experiment or in Heisenberg's uncertainty relations. 

^'^In this text, as mentioned, I concentrate on the asymptotic case of statistics gathered in infinitely 
many runs. In practice, it is essential to deal with finite-size statistics. It's this estimate that must be 
done more carefully when memory effects are taken into account: see section 2 of Gill 2012 for details 
and references. 

^*These terms can indeed be equivocal. I have met students who had memorized the slogan "the viola- 
tion of Bell inequalities demonstrates nonlocality" and had got a wrong understanding of it all. Similarly, 
hearing physicists claiming that "realism is denied" , many conclude that we can't know anything about 
reality — while everything started by accepting the reality of the observed statistics. I don't want to 
impose a ban on any of these terms and feel in fact free to use them myself whenever convenient, but 
they must be correctly understood. 

^^The supporters of the many-worlds interpretation may want to say that the violation of Bell inequal-
ities proves the existence of intrinsic randomness in each universe. In this operationally-oriented text, 
I don't want to get lost in such subtleties and assume that each of us will experience only one universe 
throughout their lives, however many other universes may "really" exist out there. 
- Intrinsic randomness per se cannot be a sufficient explanation for the violation
of Bell inequalities: one can easily conceive a world with intrinsic randomness,
in which correlations are nevertheless compatible with LV. From this perspective,
a minimal sufficient explanation consists in postulating that intrinsic
randomness must be certifiable; but I prefer to try and recover quantum
physics from physical principles such as those described in Section 6.

• The second possibility is to attribute the violation to the exchange of a signal 
between the two distant locations. This is how one would violate Bell inequalities 
with classical resources. However, such a classical simulation could be prevented 
by putting enough distance between the box of Alice and that of Bob, so as to 
ensure that one's choice of input is spacelike separated from the other's output. 
Some authors have toyed with the idea of "peaceful coexistence" , arguing that one 
could think of a signal propagating faster than light, as long as it cannot be used 
by us to send an actual message. However, if one wants to reproduce all quantum 
predictions, this requirement can be met only by signals propagating at infinite 
speed (see Section [S]). This is exactly how the "quantum potential" of Bohmian 
mechanics behaves: indeed, as is well-known, Bohmian mechanics recovers the 
predictions of quantum mechanics with determinism, but at the price of signaling. 

• The last logical option is the denial of measurement independence, which means 
that the preparation of the system may vary as a function of the measurements that
are going to be performed on it. I would argue that this choice is incompatible with
the practice of science. Indeed, a fundamental tenet of the scientific method is the
possibility of performing different measurements on identically prepared systems. 
This crucial procedure can be seen as a definition of measurement independence, 
and certainly can't make sense without it. 

• The reader may be astonished to find a fourth bullet here, while we listed only three 
assumptions above. It is dedicated to those who can't accept the violation of Bell 
inequalities in nature and struggle to restore the good classical explanation. Usually 
they scrutinize the mathematics, looking for the hidden assumption, the one that 
nobody would have noticed. Such attempts should be ignored and replaced with 
an operational challenge: skeptics should be asked to exhibit a violation of Bell 
inequalities between two classical computers, without communication and without 
post-selection. The effort of building such a simulation may prove very instructive
— for them. 

After all, most physicists believed ardently in intrinsic randomness before Bell ended up certifying 
their belief. I thank Nicolas Gisin for bringing this point to my attention. 

^'^The denial of measurement independence is frequently called denial of free will. Indeed, if $(x,y)$ are
correlated with $\lambda$, then for a given $\lambda$ some pairs $(x,y)$ must be more probable than others. Now, the
settings could in principle be chosen by human beings, whose "free will" would then be maimed. The 
argument is correct. However, when used, it tends to stir philosophical passions, typically driving the 
debate towards The Matrix, Libet's experiments, or moral responsibility — all very interesting topics, 
but whose connection with quantum randomness is (to adopt an optimistic stance) unclear. 

^^In order to convince the last die-hard, the rules of the simulation must be stated with great accuracy:
the painful task of defining what such a "Randi test" should look like has been undertaken by others
[Gill 2012] [Vongehr 2012].
2.4.6 Bell and spacelike separation

I want to conclude by hopefully dissipating some frequent confusions. Let me first repeat 
that the violation of Bell inequalities per se proves that pre-established agreement is 
not the explanation, and nothing else. In the context of quantum physics (Bohmians 
aside), one would further seek to rule out the other classical explanation, signaling, 
thus positively proving the existence of intrinsic randomness. Spacelike separation is 
the ultimate way of guaranteeing no-signaling, which should convince everyone (again, 
Bohmians aside); but it is not a strict requirement for a Bell experiment. Let me go 
back to the parable: how will the student make sure that the bureaucrat and his friend 
won't communicate during the simulation? She will certainly put them in non-adjacent 
rooms, out of obvious visual and acoustic contact. Beyond that, she may just trust their 
integrity, perhaps performing checks at random times; or she may ask them to leave 
their handphone in her lab; or she may put them in Faraday cages, thus assuming that 
they can only use electromagnetic waves as signals... Only in an extremely adversarial
scenario will the student enforce spacelike separation in the protocol. In summary: in
Bell experiments, there is an operational asymmetry between ruling out pre-established 
agreement (which is what Bell inequalities can do) and ruling out communication (which 
is also of great importance but requires independent criteria). 

3 Bell inequalities and quantum physics: the very basics 

The content of this section can be summarized by one sentence: quantum theory predicts 
the violation of Bell's inequalities and experiments have confirmed this prediction. 

Any closer look will reveal a rather rugged landscape. In the past twenty years or so,
a lot of Bell inequalities have been listed, some as belonging to nicely defined families, 
others as items of otherwise unstructured catalogues. Some of these inequalities lead 
to genuine refinements over CHSH, which the experts appreciate. Overall though, even 
these charted regions are largely unexplored and unexploited; and of course we can't know 
in advance if further exploration will lead to new insights, to a synthetic comprehension, 
or just to additions to the catalogue. Anyway, the basics can be illustrated in the
elementary scenario with two parties, each with two inputs and two outputs, leading to 
the CHSH inequality. I shall confine myself to it here and throughout all this text, unless 
stated otherwise. 

3.1 CHSH operator and Tsirelson bound 

As usual in physics, one can choose the most convenient labeling: notably, if we choose 
to label the outcomes $a,b \in \{-1,+1\}$, the correlation coefficient defined in (23) becomes
simply $E_{xy} = \langle a_x b_y \rangle$. Here, we want to consider the case where the outcomes

^^For instance, some authors have recently discovered a family of Bell inequalities that no quantum 
state can violate, a result that at first sight looks like as pointless a mathematical exercise as it gets; 
instead, these inequalities have become a tool to gain quite interesting insights on the structure of 
quantum statistics. A rule of thumb: as long as you find articles published in journals with generic 
scope, the field may still be thriving. If you see the birth of a "Journal of Bell inequalities" , you know 
that the end is approaching. 
$a_x$ and $b_y$ are results of quantum measurements: there exist four Hermitian operators
$A_0, A_1, B_0, B_1$ with eigenvalues $-1$ and $+1$, possibly degenerate, such that
$E_{xy} = \langle A_x \otimes B_y \rangle$.

In a Bell-CHSH experiment, each single-shot measurement corresponds to one of the
$A_x \otimes B_y$; the statistics of four such series of measurements are later combined in the form
(28). Within quantum theory, because of linearity, the same statistics can be seen as the
average value of the CHSH operator

$$S = A_0 \otimes B_0 + A_0 \otimes B_1 + A_1 \otimes B_0 - A_1 \otimes B_1. \tag{29}$$

Even though a Bell experiment does not consist of single-shot measurements of $S$, the
translation of Bell inequalities in the language of Bell operators is extremely useful.
In particular, a quantum state $\rho$ violates CHSH if and only if there exist measurements
such that $\mathrm{Tr}(\rho S) > 2$.

Before introducing states that actually violate CHSH, let us show a generic result,
known as the Tsirelson bound [Tsirelson 1980]:

Theorem 3.1. Measurements on quantum systems can violate the CHSH inequality (28)
at most up to

$$S \leq 2\sqrt{2}. \tag{30}$$

Proof. In order to prove the claim, we need to find an upper bound for the largest eigenvalue
of $S$, denoted as usual by $\|S\|_\infty$. By construction, we have $\|A_x\|_\infty = \|B_y\|_\infty = 1$,
$A_x^2 = \mathbf{1}_{d_A}$ and $B_y^2 = \mathbf{1}_{d_B}$; the dimensions $d_A$ and $d_B$ of the Hilbert spaces are left
unspecified and may be infinite. The bound is very simple to obtain by working with the
square of the CHSH operator

$$S^2 = 4\, \mathbf{1} \otimes \mathbf{1} - [A_0,A_1] \otimes [B_0,B_1]. \tag{31}$$

Indeed,

$$\|[A_0,A_1]\|_\infty = \|A_0A_1 - A_1A_0\|_\infty \leq \|A_0A_1\|_\infty + \|A_1A_0\|_\infty \leq 2\|A_0\|_\infty\|A_1\|_\infty = 2$$

where for the last estimate we have used the Cauchy-Schwarz-type inequality $\|XY\|_\infty \leq \|X\|_\infty \|Y\|_\infty$
(submultiplicativity of the operator norm). The same estimate leads to $\|[B_0,B_1]\|_\infty \leq 2$.
Therefore $\|S^2\|_\infty \leq 8$, which proves the
claim. □
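The operator identity (31) and the saturation of the bound can be checked on a concrete example. The sketch below is an illustration under my own choice of observables, not the general proof: $A_0 = \sigma_z$, $A_1 = \sigma_x$, $B_{0,1} = (\sigma_z \pm \sigma_x)/\sqrt{2}$, evaluated on the state $(|00\rangle + |11\rangle)/\sqrt{2}$. Only real matrices appear, so plain nested lists suffice and no external library is assumed.

```python
import math


def matmul(X, Y):
    """Matrix product of two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def kron(X, Y):
    """Kronecker (tensor) product of X and Y."""
    return [[x * y for x in row_x for y in row_y] for row_x in X for row_y in Y]


def lin(*terms):
    """Linear combination sum_k c_k * M_k, given as (c_k, M_k) pairs."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(cols)]
            for i in range(rows)]


def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    return lin((1, matmul(X, Y)), (-1, matmul(Y, X)))


def chsh_operator(A0, A1, B0, B1):
    """The CHSH operator of Eq. (29)."""
    return lin((1, kron(A0, B0)), (1, kron(A0, B1)),
               (1, kron(A1, B0)), (-1, kron(A1, B1)))


def expectation(psi, M):
    """<psi|M|psi> for a real state vector psi."""
    n = len(psi)
    return sum(psi[i] * M[i][j] * psi[j] for i in range(n) for j in range(n))


SZ = [[1, 0], [0, -1]]
SX = [[0, 1], [1, 0]]
R = 1 / math.sqrt(2)

A0, A1 = SZ, SX
B0 = lin((R, SZ), (R, SX))
B1 = lin((R, SZ), (-R, SX))
S = chsh_operator(A0, A1, B0, B1)

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
S_SQUARED = matmul(S, S)
RHS_31 = lin((4, I4), (-1, kron(comm(A0, A1), comm(B0, B1))))

PHI_PLUS = [R, 0, 0, R]  # (|00> + |11>)/sqrt(2)

if __name__ == "__main__":
    print("<S> =", expectation(PHI_PLUS, S), " 2*sqrt(2) =", 2 * math.sqrt(2))
```

The two sides of (31) agree entry by entry for these observables, and the chosen state attains the value $2\sqrt{2}$ up to rounding.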



3.2 Study of CHSH for two-qubit states 

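Historically, the first state for which a violation of Bell inequalities was noticed is the
singlet state $|\Psi^-\rangle = \frac{1}{\sqrt{2}}(|0\rangle \otimes |1\rangle - |1\rangle \otimes |0\rangle)$. As mentioned above, for this state quantum
theory predicts

$$\langle ab \rangle_{\vec{a},\vec{b}} = -\vec{a} \cdot \vec{b} \tag{32}$$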

^''Here we started with some remarks on the labeling, but of course one is not obliged to find a particularly
clever labeling before writing down Bell operators: any Bell inequality is a linear combination
of probabilities, therefore the corresponding operator is obtained by replacing each probability with the
corresponding projector.
for Alice measuring along direction $\vec{a}$, and Bob along direction $\vec{b}$, in the Bloch sphere.
With suitable choices of the measurements, these statistics can reach the Tsirelson bound
$S = 2\sqrt{2}$. Instead of dwelling on this specific, very well known case, I provide directly
a characterization of the behavior of CHSH for a generic two-qubit state

$$\rho = \frac{1}{4}\Big(\mathbf{1} \otimes \mathbf{1} + \vec{r}_\rho \cdot \vec{\sigma} \otimes \mathbf{1} + \mathbf{1} \otimes \vec{s}_\rho \cdot \vec{\sigma} + \sum_{i,j} T_\rho^{ij}\, \sigma_i \otimes \sigma_j\Big). \tag{33}$$



Theorem 3.2. The maximal value of CHSH achievable with von Neumann measurements
on a generic two-qubit state $\rho$ (33) is

$$\langle S \rangle = 2\sqrt{\lambda_1 + \lambda_2} \tag{34}$$

where $\lambda_1$ and $\lambda_2$ are the two largest eigenvalues of the symmetric matrix $T_\rho^T T_\rho$, where $T_\rho^T$
is the transpose of $T_\rho$. The proof contains the construction of a possible choice of settings
that reaches this maximum [Horodecki, Horodecki and Horodecki 1995].

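Proof. The quantity to be maximized is

$$\langle S \rangle = \mathrm{Tr}(\rho S) = \big(\vec{a}_0, T_\rho(\vec{b}_0 + \vec{b}_1)\big) + \big(\vec{a}_1, T_\rho(\vec{b}_0 - \vec{b}_1)\big) \tag{35}$$

with $\vec{b}_0 + \vec{b}_1 = 2\cos\chi\, \vec{c}$ and $\vec{b}_0 - \vec{b}_1 = 2\sin\chi\, \vec{c}^{\perp}$,
where we used the fact that the sum and difference of two unit vectors are always orthogonal
vectors, and where $(\vec{b}_0, \vec{b}_1) = \cos 2\chi$.

Since $\|\vec{a}_0\| = 1$, the maximal value of $(\vec{a}_0, T_\rho \vec{c})$ is $\|T_\rho \vec{c}\|$, obtained when $\vec{a}_0$ is chosen
parallel to $T_\rho \vec{c}$. By the same reasoning on the term involving $\vec{a}_1$, we reach

$$\max_{\vec{a}_0,\vec{a}_1,\vec{b}_0,\vec{b}_1} \langle S \rangle = \max_{\vec{b}_0,\vec{b}_1} 2\big(\cos\chi\, \|T_\rho \vec{c}\| + \sin\chi\, \|T_\rho \vec{c}^{\perp}\|\big) = \max_{\vec{c},\vec{c}^{\perp}} 2\sqrt{\|T_\rho \vec{c}\|^2 + \|T_\rho \vec{c}^{\perp}\|^2} \tag{36}$$

where in the last step we used the well known optimization $\max_\chi (x\cos\chi + y\sin\chi) = \sqrt{x^2 + y^2}$,
achievable with the choice $\cos\chi = x/\sqrt{x^2 + y^2}$. Finally, $\|T_\rho \vec{c}\|^2 = (\vec{c}, T_\rho^T T_\rho \vec{c})$
and $T_\rho^T T_\rho$ is a positive symmetric matrix. Therefore the maximization (36) is achieved
by choosing $\vec{c}$ and $\vec{c}^{\perp}$ as the two eigenvectors associated to the two largest eigenvalues.
This proves (34). The proof gives the recipe to reconstruct the measurement settings
that lead to the maximal violation.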

□ 

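Let us particularize this result to the case of a pure state:

$$|\Psi(\theta)\rangle = \cos\theta\, |0\rangle \otimes |0\rangle + \sin\theta\, |1\rangle \otimes |1\rangle \;\longrightarrow\; T_{\Psi(\theta)} = \begin{pmatrix} \sin 2\theta & & \\ & -\sin 2\theta & \\ & & 1 \end{pmatrix} \tag{37}$$

The maximal achievable value of CHSH is

$$\max \langle \Psi(\theta)|S|\Psi(\theta)\rangle = 2\sqrt{1 + \sin^2 2\theta}, \tag{38}$$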
which is always larger than 2, unless $\sin 2\theta = 0$. Therefore, all pure entangled states of
two qubits violate CHSH for suitable measurements. Moreover, only maximally entangled
states can reach $S = 2\sqrt{2}$.

Let us determine the corresponding measurement settings. The eigenvector associated
to the largest eigenvalue of $T_\rho^T T_\rho$ is $\vec{c} = \hat{z}$; the orthogonal subspace being degenerate, we
can choose any vector in it as $\vec{c}^{\perp}$: for instance $\vec{c}^{\perp} = \hat{x}$. With this choice,

$$\vec{b}_{0,1} = \cos\chi\, \hat{z} \pm \sin\chi\, \hat{x} \quad \text{with} \quad \cos\chi = 1/\sqrt{1 + \sin^2 2\theta}. \tag{39}$$

Furthermore, $\vec{a}_0$ must be the unit vector parallel to $T_\rho \vec{c}$, and $\vec{a}_1$ to $T_\rho \vec{c}^{\perp}$, so here

$$\vec{a}_0 = \hat{z}, \quad \vec{a}_1 = \hat{x}. \tag{40}$$

Now we know pretty much everything about the violation of CHSH by two-qubit states. 
In the next paragraph, we are going to see how CHSH can be adapted to prove that all 
pure entangled states violate a Bell inequality. 
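The settings (39) and (40) can be plugged back into the correlators to confirm the value (38). This is a sketch under two assumptions that I state explicitly: the correlation for directions $\vec{a}$, $\vec{b}$ is taken as $(\vec{a}, T\vec{b})$, in the form used in (35), and $T$ is the diagonal matrix with entries $(\sin 2\theta, -\sin 2\theta, 1)$ in the order (x, y, z). The function name is illustrative.

```python
import math


def chsh_pure_state(theta):
    """CHSH value of cos(theta)|00> + sin(theta)|11> for the settings (39)-(40)."""
    s = math.sin(2 * theta)
    t_diag = (s, -s, 1.0)        # assumed diagonal of T, ordered (x, y, z)
    chi = math.atan2(s, 1.0)     # cos(chi) = 1/sqrt(1+s^2), sin(chi) = s/sqrt(1+s^2)

    a0, a1 = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)       # Alice: z and x
    b0 = (math.sin(chi), 0.0, math.cos(chi))         # Bob: cos(chi) z + sin(chi) x
    b1 = (-math.sin(chi), 0.0, math.cos(chi))        # Bob: cos(chi) z - sin(chi) x

    def corr(a, b):
        # (a, T b) for a diagonal T
        return sum(ai * ti * bi for ai, ti, bi in zip(a, t_diag, b))

    return corr(a0, b0) + corr(a0, b1) + corr(a1, b0) - corr(a1, b1)


if __name__ == "__main__":
    for theta in (0.0, math.pi / 8, math.pi / 4):
        expected = 2 * math.sqrt(1 + math.sin(2 * theta) ** 2)
        print(theta, chsh_pure_state(theta), expected)
```

The value equals 2 for a product state and grows monotonically with $|\sin 2\theta|$ up to $2\sqrt{2}$ at $\theta = \pi/4$.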



3.3 All pure entangled states violate a Bell inequality ("Gisin's theorem") 

The fact that all pure entangled states violate a Bell inequality is usually referred to as Gisin's theorem, since Nicolas Gisin was the first to ask the question and to answer it for bipartite states [Gisin 1991]. Popescu and Rohrlich extended the proof to the general case shortly after [Popescu and Rohrlich 1992a]. Here, I follow this development.

Lemma 3.1. Any bipartite pure entangled state (i.e. of any dimensionality) violates a Bell inequality.

Proof. Any bipartite pure state can be written in its Schmidt decomposition $|\Psi\rangle = \sum_{k=0}^{d-1} c_k\,|k\rangle\otimes|k\rangle$, where we can define the bases such that $c_0 \geq c_1 \geq \dots \geq c_{d-1} \geq 0$. The state is entangled if and only if $c_1 \neq 0$. Let us now rewrite the state as

\[
|\Psi\rangle = \sqrt{c_0^2+c_1^2}\,\big(\cos\theta\,|0\rangle\otimes|0\rangle + \sin\theta\,|1\rangle\otimes|1\rangle\big) + \sqrt{1-c_0^2-c_1^2}\,|\Psi'\rangle \tag{41}
\]

where $\cos\theta = c_0/\sqrt{c_0^2+c_1^2}$ and $|\Psi'\rangle$ is the normalized projection of $|\Psi\rangle$ onto the subspace orthogonal to $\mathrm{Span}(|0\rangle\otimes|0\rangle, |1\rangle\otimes|1\rangle)$. Now consider the operators

\[
A_x = \hat{a}_x\cdot\vec{\sigma} \oplus \mathbb{1}'\,,\quad B_y = \hat{b}_y\cdot\vec{\sigma} \oplus \mathbb{1}' \tag{42}
\]

where for both systems the Pauli matrices are defined as acting in the subspace $\mathrm{Span}(|0\rangle, |1\rangle)$ and $\mathbb{1}' = |2\rangle\langle 2| + \dots + |d-1\rangle\langle d-1|$ is the identity on the orthogonal complement of that subspace. By choosing the measurement vectors that lead to (38), one can therefore reach a value of CHSH

\[
S = (c_0^2+c_1^2)\,2\sqrt{1+\sin^2 2\theta} + (1-c_0^2-c_1^2)\,2 \tag{43}
\]

which is larger than 2 as soon as $c_1 > 0$, as claimed.



□ 
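The value (43) is easy to tabulate for a given list of Schmidt coefficients. The following is a small sketch using only the standard library; `chsh_from_schmidt` is a hypothetical helper name, and the normalization of the input is an added convenience, not something the proof requires.

```python
import math

def chsh_from_schmidt(coefficients):
    """CHSH value of eq. (43) reached by the construction of Lemma 3.1."""
    c = sorted((abs(x) for x in coefficients), reverse=True)
    norm = math.sqrt(sum(x * x for x in c))
    c = [x / norm for x in c]                  # enforce sum of c_k^2 = 1
    c0 = c[0]
    c1 = c[1] if len(c) > 1 else 0.0
    weight = c0 * c0 + c1 * c1                 # weight of the two-qubit block
    theta = math.atan2(c1, c0)                 # cos(theta) = c0 / sqrt(c0^2 + c1^2)
    two_qubit_value = 2 * math.sqrt(1 + math.sin(2 * theta) ** 2)
    return weight * two_qubit_value + (1 - weight) * 2
```

A product state (a single non-zero coefficient) gives exactly 2, while any second non-zero coefficient pushes the value above 2, in line with the statement of the Lemma.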
There is no claim that this procedure is optimal according to any figure of merit: after all, the construction does not make any use of the entanglement that was possibly present outside $\mathrm{Span}(|0\rangle\otimes|0\rangle, |1\rangle\otimes|1\rangle)$. One may want to look for "better" inequalities, even some that are tailored to the state. However, this approach is sufficient to prove the claim; moreover, it has the advantage that the procedure is defined uniquely, once and for all. This same remark will apply also to the general theorem, which can now be proved:

Theorem 3.3. Any pure entangled state violates a Bell inequality, and this can be checked by a uniquely defined protocol in which each of the parties performs local measurements.

Proof. It is better to introduce the idea with an example. Suppose that Alice, Bob and Charlie share the GHZ state of three qubits $\frac{1}{\sqrt{2}}(|000\rangle + |111\rangle)$ and they have to perform local measurements, but they have only heard of the CHSH inequality. They can do the following: Charlie measures only $\sigma_x$: if he finds $+1$, he has prepared the state $|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ for Alice and Bob; otherwise, he has prepared the state $|\Phi^-\rangle$. Alice and Bob, as for them, alternate between two measurement settings, those suitable to violate CHSH with $|\Phi^+\rangle$. Clearly, the statistics $P(a,b|x,y,z=\sigma_x,c=+1)$ will exhibit $S = 2\sqrt{2}$. Now, it is not difficult to convince oneself that, if the $P(a,b,c|x,y,z)$ can be described by a LV model, then all the $P(a,b|x,y,z,c)$ can be described by a LV model too. Therefore a contrario, if the $P(a,b|x,y,z,c)$ violate a Bell inequality for some choices of $z$ and $c$ (as in the example), then the original three-partite statistics $P(a,b,c|x,y,z)$ cannot be described by a LV model.

In general, if a pure multipartite state is entangled, at least one pair can be prepared in a bipartite pure entangled state $|\Psi\rangle_{AB}$ by the other parties $C_1, C_2, \dots, C_N$ performing suitable measurements and obtaining the right outcomes. Given the state, therefore, the $N+2$ parties agree that $N$ of them perform only a single measurement, while the remaining two parties choose suitable measurements for $|\Psi\rangle$ to violate some Bell inequality. Lemma 3.1 guarantees that such measurements always exist. The violation implies that $P(a,b,c_1,\dots,c_N|x,y,z_1,\dots,z_N)$ cannot be described by a LV model. □
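The GHZ example of the proof can be reproduced in a few lines. This is a sketch assuming NumPy; the qubit ordering (Alice, Bob, Charlie) and the specific CHSH settings for $|\Phi^+\rangle$ are choices made for this illustration.

```python
import numpy as np

sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.array([[1.0, 0.0], [0.0, -1.0]])

# GHZ state (|000> + |111>)/sqrt(2), qubit ordering: Alice, Bob, Charlie
ghz = np.zeros(8)
ghz[0] = ghz[7] = 1 / np.sqrt(2)

# Charlie measures sigma_x and finds c = +1: project his qubit onto |+>
plus = np.array([1.0, 1.0]) / np.sqrt(2)
conditional = ghz.reshape(4, 2) @ plus          # unnormalized state of Alice and Bob
prob_c_plus = float(conditional @ conditional)  # probability of the outcome c = +1
conditional = conditional / np.sqrt(prob_c_plus)

# Settings that violate CHSH maximally on |Phi+> (an illustrative choice)
A = [sz, sx]
B = [(sz + sx) / np.sqrt(2), (sz - sx) / np.sqrt(2)]

def correlator(Ax, By):
    return float(conditional @ np.kron(Ax, By) @ conditional)

S = (correlator(A[0], B[0]) + correlator(A[0], B[1])
     + correlator(A[1], B[0]) - correlator(A[1], B[1]))
```

Conditioned on Charlie's outcome, Alice and Bob hold $|\Phi^+\rangle$ and the computed `S` equals $2\sqrt{2}$.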

Footnote (to Theorem 3.3, on local measurements): Obviously, if the parties could come together and form two groups, one goes back to the case settled by Lemma 3.1.

Footnote (to the proof of Theorem 3.3, on conditioning on Charlie's result): In order for Alice and Bob to check the violation, at some point Charlie has to send $(z, c)$ to them. This is not problematic, as can be seen in two scenarios: (i) If the information is sent before Alice and Bob choose their measurements, it can be seen as part of the procedure of state preparation, i.e. part of the pre-established agreement; (ii) If the information is sent at the end of the protocol, the post-selection done by Alice and Bob does not open the detection loophole, because it is uncorrelated with their choice of settings.

3.4 Some mixed entangled states don't violate any Bell inequality: Werner states

It is pretty obvious that entanglement is necessary to violate Bell's inequalities in quantum theory, since all separable states admit a LV model. It is very reasonable to conjecture that entanglement is also sufficient to violate Bell's inequalities, but this conjecture is wrong, as first proved by Werner [Werner 1989]:

Theorem 3.4. There exist mixed states that are entangled, but nevertheless cannot violate any Bell inequality.

The proof will be given by exhibiting explicit counterexamples. Consider the single-parameter family of two-qubit states ("Werner states") defined as

\[
\rho_W = W\,|\Psi^-\rangle\langle\Psi^-| + (1-W)\,\frac{\mathbb{1}}{4}\,,\quad W \in [0,1]. \tag{44}
\]

Using the criterion of the negative partial transposition, it can be proved that Werner states are separable for $W \leq \frac{1}{3}$ and entangled otherwise. The statistics of von Neumann measurements on Werner states are given by

\[
\mathcal{P}_Q(W) = \left\{ P(a,b|\hat{a},\hat{b}) = \tfrac{1}{4}\big(1 - W\,ab\,\hat{a}\cdot\hat{b}\big)\,,\ a,b \in \{-1,+1\}\,,\ \hat{a},\hat{b} \in \mathbb{S}^2 \right\} \tag{45}
\]
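Both statements above (the separability threshold and the statistics of projective measurements) can be checked numerically. The sketch below assumes NumPy; the helper names are hypothetical. It computes the smallest eigenvalue of the partial transpose of $\rho_W$, which should equal $(1-3W)/4$, and the probabilities of outcomes for arbitrary measurement directions.

```python
import numpy as np

I2 = np.eye(2)
PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def werner_state(W):
    """rho_W = W |Psi-><Psi-| + (1 - W) 1/4."""
    singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
    return W * np.outer(singlet, singlet.conj()) + (1 - W) * np.eye(4) / 4

def min_eig_partial_transpose(rho):
    """Smallest eigenvalue of the partial transpose on Bob's qubit."""
    # indices (a, b, a', b') -> (a, b', a', b)
    pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
    return float(np.linalg.eigvalsh(pt).min())

def projector(direction, outcome):
    """Projector on the eigenvalue `outcome` (+1 or -1) of n . sigma."""
    n = np.asarray(direction, dtype=float)
    n = n / np.linalg.norm(n)
    return (I2 + outcome * np.einsum('i,ijk->jk', n, PAULI)) / 2

def probability(W, a, b, a_dir, b_dir):
    """P(a, b | a_dir, b_dir) for the Werner state rho_W."""
    rho = werner_state(W)
    return float(np.trace(rho @ np.kron(projector(a_dir, a), projector(b_dir, b))).real)
```

The partial transpose becomes negative exactly when $W > 1/3$, and `probability` reproduces the form $\frac{1}{4}(1 - W\,ab\,\hat{a}\cdot\hat{b})$.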

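Lemma 3.2. The set $\mathcal{P}_Q(W)$ can be reproduced with LV if $W \leq \frac{1}{2}$.

Proof. It is clearly sufficient to exhibit the proof for $W = \frac{1}{2}$, since any $\rho_W$ with $W < \frac{1}{2}$ can be obtained by mixing $\rho_{1/2}$ with white noise.

In each run, the pre-shared local variable is a vector $\vec{\lambda}$ drawn from the unit sphere $\mathbb{S}^2$ with uniform distribution $\rho(\vec{\lambda})\,d\vec{\lambda} = \frac{1}{4\pi}\sin\theta\,d\theta\,d\varphi$ with the usual spherical coordinates.

Alice's box simulates the measurement of a single spin prepared in the direction $\vec{\lambda}$:

\[
P_A^{\lambda}(a|\hat{a}) = \tfrac{1}{2}\big(1 + a\,\hat{a}\cdot\vec{\lambda}\big). \tag{46}
\]

Bob's box outputs $b = -\mathrm{sign}(\hat{b}\cdot\vec{\lambda})$, i.e. $b = +1$ if $\hat{b}\cdot\vec{\lambda} < 0$, $b = -1$ if $\hat{b}\cdot\vec{\lambda} \geq 0$. So we have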
have 

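\[
P(a,+1|\hat{a},\hat{b}) = \int_{\mathbb{S}^2} d\vec{\lambda}\,\rho(\vec{\lambda})\,P_A^{\lambda}(a|\hat{a})\,\delta_{\mathrm{sign}(\hat{b}\cdot\vec{\lambda}),-1}
= \frac{1}{4} + \frac{1}{2}\,a \int_{\hat{b}\cdot\vec{\lambda}<0} d\vec{\lambda}\,\rho(\vec{\lambda})\,\hat{a}\cdot\vec{\lambda}. \tag{47}
\]

In order to compute the integral, we choose the spherical coordinates such that $\hat{b}$ is $\hat{z}$ (i.e. $\theta = 0$):

\[
\int_{\hat{b}\cdot\vec{\lambda}<0} d\vec{\lambda}\,\rho(\vec{\lambda})\,\hat{a}\cdot\vec{\lambda}
= \frac{1}{4\pi}\int_{\pi/2}^{\pi} d\theta\,\sin\theta \int_0^{2\pi} d\varphi\,\big[(a_x\cos\varphi + a_y\sin\varphi)\sin\theta + a_z\cos\theta\big]
= \frac{1}{2}\,a_z \int_{\pi/2}^{\pi} d\theta\,\sin\theta\cos\theta = -\frac{1}{4}\,a_z.
\]

Inserting this result into (47) and recalling that $a_z = \hat{a}\cdot\hat{b}$, we recover indeed (45) for $b = +1$. The calculation for $b = -1$ changes only in the bounds of the last integral and yields the desired result too. This concludes the proof of the Lemma. Since all the Werner states with $\frac{1}{3} < W \leq \frac{1}{2}$ are entangled, it proves Theorem 3.4 as well. □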
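The local-variable model of the proof lends itself to a direct Monte Carlo simulation. The following is a minimal sketch using only the standard library; the function names, the number of runs and the seed are arbitrary choices for this illustration. It samples $\vec{\lambda}$ uniformly on the sphere, lets Alice's box respond according to (46) and Bob's box deterministically, and compares the observed frequencies with (45) at $W = 1/2$.

```python
import math
import random

def simulate_lv(a_dir, b_dir, runs=100_000, seed=1):
    """Frequencies of (a, b) produced by the LV model for unit vectors a_dir, b_dir."""
    rng = random.Random(seed)
    counts = {(a, b): 0 for a in (+1, -1) for b in (+1, -1)}
    for _ in range(runs):
        # Uniform point on the unit sphere: normalize a Gaussian vector
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in g))
        lam = [x / norm for x in g]
        # Alice: spin prepared along lambda, eq. (46)
        a_dot = sum(p * q for p, q in zip(a_dir, lam))
        a = +1 if rng.random() < (1 + a_dot) / 2 else -1
        # Bob: b = -sign(b_dir . lambda)
        b_dot = sum(p * q for p, q in zip(b_dir, lam))
        b = +1 if b_dot < 0 else -1
        counts[(a, b)] += 1
    return {key: value / runs for key, value in counts.items()}

def werner_prediction(a, b, a_dir, b_dir, W=0.5):
    """Quantum prediction (45) for the Werner state rho_W."""
    dot = sum(p * q for p, q in zip(a_dir, b_dir))
    return (1 - W * a * b * dot) / 4

a_dir = (0.0, 0.0, 1.0)
b_dir = (math.sin(math.pi / 3), 0.0, math.cos(math.pi / 3))
frequencies = simulate_lv(a_dir, b_dir)
```

With these directions $\hat{a}\cdot\hat{b} = 1/2$, so the frequencies should approach 0.1875 for equal outcomes and 0.3125 for opposite outcomes, up to statistical fluctuations.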
3.5 All that I left out

The goal of this text is to present the device-independent outlook. In this perspective, the cases of violation of Bell inequalities in quantum physics constitute a bank of examples. I have kept this section short because we have collected sufficiently many examples to move on. But I encourage the reader to learn more examples by referring to the comprehensive review [Brunner et al. 2013].

In particular, the reader may want to learn about other Bell inequalities: with more 
inputs, more outputs, more parties. One will learn, first of all, that the computation of 
Tsirelson bounds and of violations for a family of states are not obvious tasks, contrary 
to what our specific Theorems 3.1 and 3.2 may suggest. Intriguing features emerge,
which the simple CHSH does not possess: for instance, some inequalities are surprisingly 
maximally violated by non-maximally entangled states; other inequalities can be used as 
dimension witnesses, because the maximal violation cannot be achieved with (say) two 
qubits and requires in fact quantum systems of dimension larger than some bound. 

It is also important to realize that the ultimate test of the non-classicality of entangled 
states need not consist of the elementary procedure "take a state and measure it" . The 
Bell test may be the final step of more complicated procedures, involving for instance 
an initial filtering (hidden non-locality) or multi-partite preparation (activation of non-locality).

4 Device-independent assessment 

4.1 A new vantage point 

When I entered the field back in the year 2000, Bell inequalities seemed to have already 
fulfilled their mission. Sure enough, a few passionate researchers were still distilling mod- 
erately interesting mathematical and physical insight from them. But most colleagues 
regarded them as one would regard, for instance, Foucault's pendulum: an instrument, 
which has allowed humankind to firmly establish a crucial fact about our physical world. 
Both the pendulum and the inequalities should feature in the science museums of the 
world, so that every curious human would be informed that the Earth rotates around 
its axis and that there exists intrinsic randomness — but research should move forward. 
For a few years, there was no strong argument against this stance: the few of us, who 
were still researching on Bell inequalities and related notions, looked like the lovers of 
Kodak films enjoying their last moments of fun at the dawn of the digital era. 

It all changed thanks to a complex chain of thoughts, stretching between 2005 and 
2007, which I narrate in Appendix B because I have chosen to order the materials in this
text in a logical, rather than chronological, sequence. The final outcome can however be 
explained here because it was, in hindsight, a flash of the obvious. 

I have stressed many times that Bell inequalities are independent of quantum theory (and so they must be, if they have to test quantum theory against the alternative description of pre-established agreement). The flash of the obvious is that one can be more specific: the assessment of a Bell test does not rely on the knowledge of the degrees of freedom that are measured. Notions like "photons", "polarization", "complementary bases", "dimension of the Hilbert space" are completely dispensed with at the moment of analyzing a Bell test (though, needless to say, the experimentalists who build the test had better have a very good control of all that). In other words, Bell inequalities are the only entanglement witnesses that do not rely on assumptions about dimensionality of the state or commutation relations of the observables.

This observation opens up the possibility of device-independent assessment of entan- 
glement and of related operational quantities like the secrecy of a cryptographic key, 
the amount of intrinsic randomness that is generated, etc. Besides their undisputed role 
in shaping the scientific worldview, Bell inequalities have a unique role to play in
future quantum technologies by providing the ultimate level of device certification. 



4.2 Self-testing 

Self-testing, which could also be called device-independent characterization of the state 
and the measurements, or simply blind tomography, refers to the fact that some statistics 
predicted by quantum theory determine the state and the measurement as uniquely as
possible, namely up to a local isometry. We first explain this equivalence class, then 
discuss explicit examples of self-testing. 



4.2.1 Equivalence up to a local isometry 

If the observed statistics $P(a,b|x,y)$ are the only available data, the state and the measurements can be characterized at most up to local isometries. Indeed, if nothing is assumed about the dimension, one obtains the same statistics

\[
P(a,b|x,y) = \mathrm{Tr}\big(\rho\,E_a^x\otimes E_b^y\big) = \mathrm{Tr}\big(\tilde{\rho}\,\tilde{E}_a^x\otimes\tilde{E}_b^y\big) \tag{48}
\]

by appending other degrees of freedom according to $\tilde{\rho} = \rho\otimes\sigma_{A'B'C}$, and performing trivial measurements on those ($\tilde{E}_a^x = E_a^x\otimes\mathbb{1}_{A'C}$ and $\tilde{E}_b^y = E_b^y\otimes\mathbb{1}_{B'C}$). Further, if nothing is assumed about the measurements, one obtains the same statistics if states and measurements differ by local unitaries, i.e. $\tilde{\rho} = U_A\otimes U_B\,\rho\,U_A^\dagger\otimes U_B^\dagger$, $\tilde{E}_a^x = U_A E_a^x U_A^\dagger$ and $\tilde{E}_b^y = U_B E_b^y U_B^\dagger$.

Let us practice this notion on an example. Consider the state

\[
|\Psi\rangle = \sum_{k=0,1,\dots} c_k\,\frac{|2k,2k\rangle + |2k+1,2k+1\rangle}{\sqrt{2}}\,. \tag{49}
\]

This is visibly a direct sum of singlets (since local unitaries are free, here I shall call "singlet" any maximally entangled state of two qubits and shall rather use $|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$). One expects this state to exhibit at least all the properties of the singlet. A local isometry helps to make this fact manifest. Let us append to $|\Psi\rangle_{AB}$ a two-qubit local ancilla $|00\rangle_{A'B'}$ and apply to both sides the local isometry $\Phi = \Phi_A\otimes\Phi_B$ with

\[
\Phi_A|2k,0\rangle_{AA'} = |2k,0\rangle_{AA'}\,, \tag{50a}
\]
\[
\Phi_A|2k+1,0\rangle_{AA'} = |2k,1\rangle_{AA'}\,, \tag{50b}
\]





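and $\Phi_B$ identically defined. We obtain

\[
\Phi\big(|\Psi\rangle_{AB}|00\rangle_{A'B'}\big)
= \big(c_0|00\rangle + c_1|22\rangle + \dots\big)_{AB}\,|\Phi^+\rangle_{A'B'}
= \Big(\sum_k c_k|2k,2k\rangle\Big)_{AB}\,|\Phi^+\rangle_{A'B'}\,. \tag{51}
\]

The local isometry has mapped the singlet into A'B', while AB carry the rest of the structure (in which there may be a lot of entanglement left).
With these notions in place, we can study self-testing.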
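The example of this paragraph can be verified on a truncated version of the state with two blocks (local dimension 4). The sketch below assumes NumPy; the truncation, the coefficients `c` and the representation of $\Phi_A$ as a rectangular matrix `V` (acting on system A with the ancilla already set to $|0\rangle$) are assumptions made for the illustration.

```python
import numpy as np

d = 4                                  # local dimension: blocks k = 0, 1
c = np.array([0.8, 0.6])               # illustrative coefficients, c0^2 + c1^2 = 1
basis = np.eye(d)

# |Psi> = sum_k c_k (|2k,2k> + |2k+1,2k+1>)/sqrt(2), ordering (A, B)
psi = sum(c[k] * (np.kron(basis[2 * k], basis[2 * k])
                  + np.kron(basis[2 * k + 1], basis[2 * k + 1])) / np.sqrt(2)
          for k in range(d // 2))

# Local isometry: |2k> -> |2k>|0>, |2k+1> -> |2k>|1>, output ordering (A, A')
V = np.zeros((2 * d, d))
for k in range(d // 2):
    V[2 * (2 * k) + 0, 2 * k] = 1.0
    V[2 * (2 * k) + 1, 2 * k + 1] = 1.0

out = np.kron(V, V) @ psi              # ordering (A, A', B, B')

# Expected result: (sum_k c_k |2k,2k>)_AB (x) |Phi+>_A'B', reordered to (A, A', B, B')
rest = sum(c[k] * np.kron(basis[2 * k], basis[2 * k]) for k in range(d // 2))
phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
expected = (np.kron(rest, phi_plus)
            .reshape(d, d, 2, 2).transpose(0, 2, 1, 3).reshape(-1))
```

The vectors `out` and `expected` coincide, which is the content of (51) for this truncated example.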



4.2.2 Self-test of the singlet using CHSH 

The simplest self-test criterion to state is probably the following: 

Theorem 4.1. If a CHSH test yields $S = 2\sqrt{2}$ exactly, then, up to local isometries, the state is a singlet of two qubits and the measurements are the corresponding Pauli matrices.

In other words, any state that leads to $S = 2\sqrt{2}$ can be written as (49), or as a mixture of such states out of which the singlet can be extracted with the same isometry.

This theorem has been proved in various ways [Popescu and Rohrlich 1992b; Braunstein, Mann and Revzen 19...]
and ultimately the proof can be related to the one given below; so I skip it here. But two
remarks are worth making. First, it is remarkable that self-testing can be achieved from 
a single number, associated to a procedure that uses only two measurements per site. 
Second, only extremal points of the set of quantum statistics can be self-tested exactly, 
but these points can never be achieved in a real experiment: therefore, self-testing calls 
for robustness bounds if it is ever to become useful. As it turns out, such bounds have 
been given; but I'll skip them, since they are not optimal and only add technicalities to 
the proofs. 
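For the ideal two-qubit realization, the statement of Theorem 4.1 can be illustrated by diagonalizing the CHSH operator. The sketch below assumes NumPy and a specific choice of Pauli settings; it only checks the qubit case and says nothing about the device-independent part of the theorem.

```python
import numpy as np

sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.array([[1.0, 0.0], [0.0, -1.0]])

# Illustrative settings: A0 = Z, A1 = X, B0 = (Z + X)/sqrt(2), B1 = (Z - X)/sqrt(2)
A0, A1 = sz, sx
B0, B1 = (sz + sx) / np.sqrt(2), (sz - sx) / np.sqrt(2)

chsh_operator = np.kron(A0, B0 + B1) + np.kron(A1, B0 - B1)
eigenvalues, eigenvectors = np.linalg.eigh(chsh_operator)   # ascending order

largest = eigenvalues[-1]
top_state = eigenvectors[:, -1]
phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
overlap = abs(phi_plus @ top_state)
```

The largest eigenvalue is $2\sqrt{2}$, it is non-degenerate, and its eigenvector is $|\Phi^+\rangle$ up to a phase.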



4.2.3 Self-test of the singlet using the Mayers-Yao statistics

Consider two settings labeled $\{X_A, Z_A\}$ on Alice's side and three settings labeled $\{X_B, Z_B, D_B\}$ on Bob's side. All the measurements are binary and their outcomes are labeled ±1. The locality of the measurements $[M_A, N_B] = 0$ is assumed; apart from this, nothing is known a priori about these measurements. In particular, the dimensionality is not known: this implies that any experiment can be described by the measurement of an unknown but pure bipartite state $|\Psi\rangle$.

Suppose now that one observes that all the $\langle\Psi|M_A N_B|\Psi\rangle$ are equal to the two-qubit values one would obtain for the singlet and the corresponding Pauli matrices $\langle\Phi^+|\sigma_m\otimes\sigma_n|\Phi^+\rangle$, with the identification $\sigma_d = \frac{1}{\sqrt{2}}(\sigma_z + \sigma_x)$: then $|\Psi\rangle$ is equivalent to $|\Phi^+\rangle$ up to local isometries, and moreover all the operators are also equivalent to Pauli matrices acting on the effective qubit. This is a modification of the original Mayers-Yao scheme [Mayers and Yao 1998, 2004], which used three settings also on Alice's side; the proof can be presented here thanks to the work of Matthew McKague, who found a way of rederiving the original result in simple terms.
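The two-qubit reference values mentioned above are quickly tabulated. This is a small sketch assuming NumPy; the dictionary keys are just labels chosen for the illustration.

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
D = (Z + X) / np.sqrt(2)                 # the identification sigma_d = (sigma_z + sigma_x)/sqrt(2)
phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)

def expectation(M, N):
    """<Phi+| M (x) N |Phi+> for 2x2 matrices M (Alice) and N (Bob)."""
    return float(phi_plus @ np.kron(M, N) @ phi_plus)

alice = {"X": X, "Z": Z}
bob = {"X": X, "Z": Z, "D": D}
reference = {(m, n): expectation(alice[m], bob[n]) for m in alice for n in bob}
```

The resulting table has perfect correlations for equal settings, vanishing correlations for the mixed pairs, and $1/\sqrt{2}$ whenever Bob uses $D$.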
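Theorem 4.2. Consider five unknown operators $\{X_A, Z_A; X_B, Z_B, D_B\}$ with binary outcomes labeled ±1 and assumed to fulfill $[M_A, N_B] = 0$; if

\[
\langle\Psi|Z_A Z_B|\Psi\rangle = \langle\Psi|X_A X_B|\Psi\rangle = 1 \tag{52}
\]
\[
\langle\Psi|X_A Z_B|\Psi\rangle = \langle\Psi|Z_A X_B|\Psi\rangle = 0 \tag{53}
\]
\[
\langle\Psi|Z_A D_B|\Psi\rangle = \langle\Psi|X_A D_B|\Psi\rangle = 1/\sqrt{2} \tag{54}
\]

then there exists a local isometry $\Phi = \Phi_A\otimes\Phi_B$ such that

\[
\Phi\big(|\Psi\rangle_{AB}|00\rangle_{A'B'}\big) = |\mathrm{junk}\rangle_{AB}\,|\Phi^+\rangle_{A'B'}\,, \tag{55}
\]
\[
\Phi\big(M_A N_B|\Psi\rangle_{AB}|00\rangle_{A'B'}\big) = |\mathrm{junk}\rangle_{AB}\,\big(\sigma_m\otimes\sigma_n|\Phi^+\rangle_{A'B'}\big)\,. \tag{56}
\]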

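Proof. The condition (52) can be rewritten as

\[
Z_A|\Psi\rangle = Z_B|\Psi\rangle \quad\text{and}\quad X_A|\Psi\rangle = X_B|\Psi\rangle\,. \tag{57}
\]

Inserting this into (53), we find $\langle\Psi|X_A Z_A|\Psi\rangle = 0$, that is $X_A|\Psi\rangle$ is orthogonal to $Z_A|\Psi\rangle$. If this is the case, then (54) means that

\[
D_B|\Psi\rangle = \frac{Z_A + X_A}{\sqrt{2}}\,|\Psi\rangle\,: \tag{58}
\]

indeed, $D_B|\Psi\rangle$ must be of norm 1, and we know already two of its projections of amplitude $\frac{1}{\sqrt{2}}$ on two orthogonal vectors, so there can't be anything left. The last preparatory step consists in computing $D_B^2|\Psi\rangle$. One must be slightly careful here, because (58) tells us how $D_B$ behaves on $|\Psi\rangle$, not how it behaves on the different vector $D_B|\Psi\rangle$. Nevertheless, using $[M_A, N_B] = 0$ we obtain

\[
D_B^2|\Psi\rangle = D_B\,\frac{Z_A + X_A}{\sqrt{2}}\,|\Psi\rangle = \frac{Z_A + X_A}{\sqrt{2}}\,D_B|\Psi\rangle = \frac{1}{2}\big(Z_A^2 + X_A^2 + Z_A X_A + X_A Z_A\big)|\Psi\rangle\,.
\]

But $D_B^2 = Z_A^2 = X_A^2 = \mathbb{1}$, so finally

\[
Z_A X_A|\Psi\rangle = -X_A Z_A|\Psi\rangle \;\Longrightarrow\; Z_B X_B|\Psi\rangle = -X_B Z_B|\Psi\rangle\,. \tag{59}
\]

The third setting $D_B$ was instrumental in deriving these anti-commutation relations and has now finished its role.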

The rest of the proof consists in exhibiting an explicit local isometry which leads to self-test of the singlet and of the corresponding measurements when (57) and (59) hold. The isometry is described in Fig. 2; a simple step-by-step calculation shows that it implements the transformation

\[
\Phi\big(|\Psi\rangle_{AB}|00\rangle_{A'B'}\big) = \frac{1}{4}\Big\{
\big[(\mathbb{1}+Z_A)(\mathbb{1}+Z_B)|\Psi\rangle\big]|00\rangle
+ \big[X_A X_B(\mathbb{1}-Z_A)(\mathbb{1}-Z_B)|\Psi\rangle\big]|11\rangle
+ \big[X_B(\mathbb{1}+Z_A)(\mathbb{1}-Z_B)|\Psi\rangle\big]|01\rangle
+ \big[X_A(\mathbb{1}-Z_A)(\mathbb{1}+Z_B)|\Psi\rangle\big]|10\rangle
\Big\}\,. \tag{60}
\]

Footnote (on the conditions (57) and (59)): The proof of self-testing from CHSH can be made by showing that $S = 2\sqrt{2}$ implies (57) and (59) for suitably defined operators; then continuing with the same steps as we are going to describe.

[Figure 2: circuit diagram. Input $M_A N_B|\Psi\rangle$ on the systems A and B; each ancilla qubit starts in $|0\rangle$; the gates shown are Hadamards H on the ancillas and controlled $Z_A$, $X_A$ (Alice) and $Z_B$, $X_B$ (Bob).]

Fig. 2. Local isometry $\Phi$ that allows self-testing of the singlet and all the measurements of the Mayers-Yao test ($M, N \in \{I, X, Z\}$). From top to bottom, the rows represent the systems A' (qubit), A (unconstrained), B (unconstrained) and B' (qubit). H is the Hadamard unitary gate defined as usual: $H|0\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$, $H|1\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$. The controlled gates act non-trivially on the target when the control qubit is in state $|1\rangle$.



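First, using (57), we replace $Z_B$ with $Z_A$, and this cancels the third and fourth lines because $(\mathbb{1}+Z_A)(\mathbb{1}-Z_A) = \mathbb{1} - Z_A^2 = 0$. Then, using (59) one proves that $X_A X_B(\mathbb{1}-Z_A)(\mathbb{1}-Z_B)|\Psi\rangle = (\mathbb{1}+Z_A)(\mathbb{1}+Z_B)X_A X_B|\Psi\rangle$, which is in turn equal to $(\mathbb{1}+Z_A)(\mathbb{1}+Z_B)|\Psi\rangle$ because of (57). Finally $(\mathbb{1}+Z_A)(\mathbb{1}+Z_B)|\Psi\rangle = (\mathbb{1}+Z_A)^2|\Psi\rangle = 2(\mathbb{1}+Z_A)|\Psi\rangle$, so we have found

\[
\Phi\big(|\Psi\rangle_{AB}|00\rangle_{A'B'}\big)
= \frac{1}{2}(\mathbb{1}+Z_A)|\Psi\rangle\,\big[|00\rangle + |11\rangle\big]
= \Big(\frac{\mathbb{1}+Z_A}{\sqrt{2}}|\Psi\rangle\Big)_{AB}\,|\Phi^+\rangle_{A'B'}
= |\mathrm{junk}\rangle_{AB}\,|\Phi^+\rangle_{A'B'}
\]

which is the self-testing of the state.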

The proof for the measurements follows the same steps, starting with $\Phi\big(M_A N_B|\Psi\rangle_{AB}|00\rangle_{A'B'}\big)$ instead of (60). Let me show it for one of the six cases:

\[
\Phi\big(X_A|\Psi\rangle_{AB}|00\rangle_{A'B'}\big) = \frac{1}{4}\Big\{
\big[(\mathbb{1}+Z_A)(\mathbb{1}+Z_B)X_A|\Psi\rangle\big]|00\rangle
+ \big[X_A X_B(\mathbb{1}-Z_A)(\mathbb{1}-Z_B)X_A|\Psi\rangle\big]|11\rangle
+ \big[X_B(\mathbb{1}+Z_A)(\mathbb{1}-Z_B)X_A|\Psi\rangle\big]|01\rangle
+ \big[X_A(\mathbb{1}-Z_A)(\mathbb{1}+Z_B)X_A|\Psi\rangle\big]|10\rangle
\Big\}\,.
\]

By using (59), one moves $X_A$ to the left in the first, second and fourth lines while changing the sign of $Z_A$, and $X_B$ to the right in the third while changing the sign of $Z_B$. The analysis is then the same as above, and the result is

\[
\Phi\big(X_A|\Psi\rangle_{AB}|00\rangle_{A'B'}\big) = \frac{1}{2}(\mathbb{1}+Z_A)|\Psi\rangle\,\big[|01\rangle + |10\rangle\big] = |\mathrm{junk}\rangle_{AB}\,\big(\sigma_x\otimes\mathbb{1}\,|\Phi^+\rangle_{A'B'}\big)\,.
\]

This concludes the proof, which was unpublished as such, but is a particular case of the robustness proofs presented in [McKague, Yang and Scarani 2012]. □
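The transformation implemented by the isometry can be tried out on a concrete model. The sketch below assumes NumPy; it encodes the four branches produced by the circuit (the expression labeled (60) in the proof) and applies them to a hypothetical state made of two singlet-like blocks on local dimension 4, with block-wise Pauli operators standing in for the unknown $X$ and $Z$. The function name and the test state are illustrative assumptions.

```python
import numpy as np

def swap_isometry_output(psi, XA, ZA, XB, ZB):
    """Output of the isometry on psi (x) |00>; ordering is (AB) then (A'B')."""
    one = np.eye(psi.shape[0])
    out = np.zeros(4 * psi.shape[0], dtype=complex)
    for s in (0, 1):              # final value of Alice's ancilla A'
        for t in (0, 1):          # final value of Bob's ancilla B'
            branch = (one + (-1) ** s * ZA) @ (one + (-1) ** t * ZB) @ psi
            if s:
                branch = XA @ branch
            if t:
                branch = XB @ branch
            ancilla = np.zeros(4)
            ancilla[2 * s + t] = 1.0
            out += np.kron(branch, ancilla) / 4
    return out

sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.array([[1.0, 0.0], [0.0, -1.0]])
X_loc = np.kron(np.eye(2), sx)    # sigma_x inside each block {|2k>, |2k+1>}
Z_loc = np.kron(np.eye(2), sz)    # sigma_z inside each block
XA, ZA = np.kron(X_loc, np.eye(4)), np.kron(Z_loc, np.eye(4))
XB, ZB = np.kron(np.eye(4), X_loc), np.kron(np.eye(4), Z_loc)

c = np.array([0.8, 0.6])          # illustrative block amplitudes
basis = np.eye(4)
psi = sum(c[k] * (np.kron(basis[2 * k], basis[2 * k])
                  + np.kron(basis[2 * k + 1], basis[2 * k + 1])) / np.sqrt(2)
          for k in range(2))

out = swap_isometry_output(psi, XA, ZA, XB, ZB)
junk = (np.eye(16) + ZA) @ psi / np.sqrt(2)
phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
```

In this model the output factorizes as `junk` on AB times $|\Phi^+\rangle$ on A'B', and `junk` is normalized.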

4.3 Intermezzo: two tools 

The main difficulty that one encounters in device-independent studies is the fact that 
there are a priori infinitely many free parameters, since the dimension of the Hilbert 
space is arbitrary. A reduction to a finite problem seems to be necessary if one wants 
to compute explicit bounds. In self-testing, for instance, the isometry takes care of 
relegating all the potential infinity into the junk — by the way, in the text I have by- 
passed the main difficulty faced in the research work by providing the isometry, instead 
of leaving it to be found by the reader. Before moving to the next example of device- 
independent assessment, we need to introduce two of the main other tools developed so 
far. They are not on equal footing: the first one is very specific to the CHSH case, the 
second is far more general. 

4.3.1 A very specific tool: decomposition of CHSH 

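We start with Jordan's lemma:

Lemma 4.1. Let $A_0$ and $A_1$ be two Hermitian operators with eigenvalues $-1$ and $+1$. There exists a basis in which both operators are block-diagonal, in blocks of dimension 2 × 2 at most.

Proof. By definition, $A_0^2 = A_1^2 = \mathbb{1}$ in the suitable (unknown) dimension. It is then trivial to check that $U = A_0 A_1$ is unitary. Let us denote $|\alpha,0\rangle$ an eigenstate of $U$: $U|\alpha,0\rangle = \omega_\alpha|\alpha,0\rangle$ with $|\omega_\alpha| = 1$. Then $|\alpha,1\rangle = A_0|\alpha,0\rangle$ is also an eigenstate of $U$: indeed, $U|\alpha,1\rangle = A_0 A_1 A_0|\alpha,0\rangle = A_0 U^\dagger|\alpha,0\rangle = \omega_\alpha^*|\alpha,1\rangle$. Therefore:

\[
A_0|\alpha,0\rangle = |\alpha,1\rangle \quad\text{and}\quad A_0|\alpha,1\rangle = |\alpha,0\rangle\,, \tag{61}
\]
\[
A_1|\alpha,1\rangle = U^\dagger|\alpha,0\rangle = \omega_\alpha^*|\alpha,0\rangle \quad\text{and}\quad A_1|\alpha,0\rangle = A_1\big(\omega_\alpha A_1|\alpha,1\rangle\big) = \omega_\alpha|\alpha,1\rangle\,. \tag{62}
\]

Since the eigenvectors of a unitary operator span the whole Hilbert space, we have

\[
A_0 = \bigoplus_\alpha \sigma_x^\alpha \quad\text{and}\quad A_1 = \bigoplus_\alpha \big[\mathrm{Re}(\omega_\alpha)\,\sigma_x^\alpha + \mathrm{Im}(\omega_\alpha)\,\sigma_y^\alpha\big] \tag{63}
\]

where the $\sigma^\alpha$ are the usual Pauli matrices defined in $\mathrm{Span}\{|\alpha,0\rangle, |\alpha,1\rangle\}$ with the conventional choice $\sigma_z^\alpha = |\alpha,0\rangle\langle\alpha,0| - |\alpha,1\rangle\langle\alpha,1|$. □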
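The key step of the proof (each eigenvector of $U$ is mapped by $A_0$ to an eigenvector with the conjugate eigenvalue) can be checked on random operators. This is a sketch assuming NumPy; the dimension, the ranks and the seed are arbitrary, and `random_reflection` is a hypothetical helper that builds a Hermitian operator with eigenvalues ±1 from a random projector.

```python
import numpy as np

def random_reflection(dim, rank, rng):
    """Hermitian operator A = 2P - 1 with P a random projector of the given rank."""
    m = rng.normal(size=(dim, rank)) + 1j * rng.normal(size=(dim, rank))
    q, _ = np.linalg.qr(m)                 # orthonormal columns spanning a random subspace
    P = q @ q.conj().T
    return 2 * P - np.eye(dim)

rng = np.random.default_rng(0)
dim = 6
A0 = random_reflection(dim, 3, rng)
A1 = random_reflection(dim, 2, rng)
U = A0 @ A1

omegas, vectors = np.linalg.eig(U)
max_residual = 0.0
for omega, v in zip(omegas, vectors.T):
    partner = A0 @ v                       # candidate |alpha,1> = A0 |alpha,0>
    residual = np.linalg.norm(U @ partner - np.conj(omega) * partner)
    max_residual = max(max_residual, residual)
```

The residual stays at the level of numerical precision, so each pair $\{|\alpha,0\rangle, A_0|\alpha,0\rangle\}$ spans a subspace of dimension at most 2 that is invariant under both $A_0$ and $A_1$.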
As a direct consequence of this Lemma applied to both Alice's and Bob's boxes, there exists a basis in which the CHSH operator can be decomposed as

\[
S = \bigoplus_{\alpha,\beta} S^{\alpha\beta} \tag{64}
\]

where each of the $S^{\alpha\beta}$ is a two-qubit CHSH operator. In other words, any observed violation of CHSH can be written as

\[
S_{obs} = \mathrm{Tr}(\rho S) = \sum_{\alpha,\beta} \mathrm{Tr}\big(\rho^{\alpha\beta} S^{\alpha\beta}\big) \tag{65}
\]

where $\rho^{\alpha\beta}$ is the unnormalized two-qubit state whose matrix elements are

\[
\rho^{\alpha\beta}_{2i+j+1,\,2k+l+1} = \langle\alpha,i;\beta,j|\,\rho\,|\alpha,k;\beta,l\rangle\,,\quad i,j,k,l \in \{0,1\}\,.
\]

One can therefore hope to reduce a device-independent study of CHSH to a two-qubit problem. This reduction does not follow automatically from the decomposition, but it is indeed quite often possible: we are going to see an example with randomness amplification below.



4.3.2 A very general tool: characterization of the quantum set 

At the beginning of Section 2 I recalled that the observed statistics $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ can be obtained from measurement on a quantum state if there exist a state and suitable POVMs such that $P(a,b|x,y) = \mathrm{Tr}(\rho\,E_a^x\otimes E_b^y)$. In this case, one says that $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ belongs to the quantum set $\mathcal{Q}$ for the scenario $(\mathcal{X},\mathcal{A};\mathcal{Y},\mathcal{B})$. The dimension of the Hilbert spaces is left arbitrary. Therefore, assessing whether $\mathcal{P}_{\mathcal{X},\mathcal{Y}} \in \mathcal{Q}$ is a possible device-independent test. As such, this test won't be often performed on observed data, insofar as nobody expects to observe statistics in flagrant violation of quantum physics. However, the characterization of the quantum set is a crucial tool for more relevant device-independent tests: all those, in which one aims at optimizing some value over all possible quantum realizations.

But how to characterize the quantum set? The question looks similar to the characterization of LV models, but the mathematical answer is very different. This is due to the fact that the quantum set is convex but has continuously many extremal points. Its boundaries are not hyperplanes, like the facets of a polytope. The tool we are going to discuss allows one to approach the boundaries of the quantum set from outside.

Footnote (on the reduction to a two-qubit problem): There is still quite some complexity left in (64): for instance, one cannot optimize each $S^{\alpha\beta}$ independently, since the Pauli matrices of Alice (Bob) are the same in all operators with the same $\alpha$ ($\beta$). Such constraints are notably the reason why the conjectured maximal violation of CHSH by higher dimensional pure states is still unproven (see [Liang and Doherty 2006] for the most thorough numerical verification). Another example, in which the reduction does not seem to be possible, is the study of robust self-testing (recall that this is the study of how the self-testing conclusions have to be modified when the observed correlations differ from the ideal ones).

Footnote (on convexity and extremal points of the quantum set): The proof of convexity is rather simple: since the dimension is not fixed, given two quantum points $P_1$ and $P_2$, one can always think of the underlying states and measurements to live in disjoint direct sums. Then, the point $pP_1 + (1-p)P_2$ is obtained by measuring the obvious mixed state with the suitably extended measurements. The fact that there are continuously many extremal points is proved by exhibiting an explicit family of such points: I do not think this proof instructive enough as to be reported here.

Let us start with the following observation:

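Lemma 4.2. Let $\{F_1, \dots, F_n\}$ be a collection of operators. The $n \times n$ matrix $M$ whose entries are

\[
M_{ij} = \mathrm{Tr}\big(\rho\,F_i^\dagger F_j\big) \tag{66}
\]

is non-negative.

Proof. The proof is very direct: for any vector $v \in \mathbb{C}^n$ it holds

\[
v^\dagger M v = \sum_{i,j} v_i^* M_{ij} v_j = \mathrm{Tr}\big(\rho\,C^\dagger C\big) \geq 0 \quad\text{with}\quad C = \sum_j v_j F_j\,,
\]

because both $\rho$ and any operator of the form $C^\dagger C$ are positive. □

Therefore, $M \geq 0$ for all $M$ constructed as above is a necessary condition for $\mathcal{P}_{\mathcal{X},\mathcal{Y}} \in \mathcal{Q}$. The idea is therefore to construct matrices of this type, whose entries can be expressed in terms of the observed statistics $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$, and check whether they can be non-negative.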

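At this point, it is useful to break the chain of general reasoning and consider the same example of paragraph 2.4.2: the correlators in the CHSH scenario. The observed data are $(E_{00}, E_{01}, E_{10}, E_{11})$. In the quantum formalism,

\[
E_{xy} = \mathrm{Tr}\Big[\rho\,\big(E^x_{a=+1} - E^x_{a=-1}\big)\otimes\big(E^y_{b=+1} - E^y_{b=-1}\big)\Big]\,. \tag{67}
\]

We want to construct $M$ that can accommodate the observed data. Let us define

\[
F_1 = \big[E^{x=0}_{a=+1} - E^{x=0}_{a=-1}\big]\otimes\mathbb{1}\,,\quad
F_2 = \big[E^{x=1}_{a=+1} - E^{x=1}_{a=-1}\big]\otimes\mathbb{1}\,,\quad
F_3 = \mathbb{1}\otimes\big[E^{y=0}_{b=+1} - E^{y=0}_{b=-1}\big]\,,\quad
F_4 = \mathbb{1}\otimes\big[E^{y=1}_{b=+1} - E^{y=1}_{b=-1}\big]\,;
\]

notice that $F_j^2 = \mathbb{1}$ because $\Pi^x_a\Pi^x_{a'} = \delta_{a,a'}\Pi^x_a$ for the projectors $\Pi^x_a = E^x_a\otimes\mathbb{1}$, and the analog for Bob. Then we have

\[
M = \begin{pmatrix}
1 & \mathrm{Tr}(\rho F_1F_2) & \mathrm{Tr}(\rho F_1F_3) & \mathrm{Tr}(\rho F_1F_4) \\
 & 1 & \mathrm{Tr}(\rho F_2F_3) & \mathrm{Tr}(\rho F_2F_4) \\
 & & 1 & \mathrm{Tr}(\rho F_3F_4) \\
 & & & 1
\end{pmatrix} \tag{68}
\]

(since the matrix is symmetric, for clarity of reading I fill only the upper half). The observed data fit in this matrix as

\[
M = \begin{pmatrix}
1 & u_1 & E_{00} & E_{01} \\
 & 1 & E_{10} & E_{11} \\
 & & 1 & u_2 \\
 & & & 1
\end{pmatrix} \tag{69}
\]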
Two entries are undetermined, since we can't have observed data for $\mathrm{Tr}(\rho F_1F_2)$ and $\mathrm{Tr}(\rho F_3F_4)$: these numbers would require one of the parties to perform two measurements in the same run, which is against the operational rules of the black boxes. Nevertheless, if $(E_{00}, E_{01}, E_{10}, E_{11}) \in \mathcal{Q}$, there must exist two real numbers $u_1, u_2 \in [-1,+1]$ such that $M \geq 0$. The calculation of the conditions under which this happens is available in the literature and not particularly instructive, but the result is remarkable: one can find $u_1, u_2$ such that $M \geq 0$ if and only if

\[
|A_{00} + A_{01} + A_{10} - A_{11}| \leq \pi \tag{70}
\]

with $A_{xy} = \arcsin(E_{xy})$. This non-linear inequality approximates the boundary of the quantum set for correlators and has at times been called a "quantum Bell inequality". An interesting case study is the statistics $P(a,b|x,y) = \frac{1}{4}\big[1 + ab\,\frac{(-1)^{xy}}{\sqrt{2}}\big]$. It is the only quantum point that reaches $S = 2\sqrt{2}$ and it saturates this inequality as well. On the one hand, this example shows that even elementary tests may enforce decent boundaries. On the other hand, we know (from self-testing) that a quantum point reaching $S = 2\sqrt{2}$ must also have the marginals of the singlet, namely $P(a|x) = P(b|y) = \frac{1}{2}$: it is then simple to construct valid $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ with the same correlators but biased marginals, which are therefore not quantum but cannot be detected by this test.
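The criterion (70) is simple enough to evaluate by hand or with a few lines of code. The sketch below uses only the standard library and implements the inequality exactly as written above (a single sign pattern); the function names and the numerical tolerance are choices made for this illustration.

```python
import math

def chsh(E00, E01, E10, E11):
    """CHSH combination of the four correlators."""
    return E00 + E01 + E10 - E11

def satisfies_quantum_bound(E00, E01, E10, E11, tol=1e-12):
    """Check |asin(E00) + asin(E01) + asin(E10) - asin(E11)| <= pi, eq. (70)."""
    total = math.asin(E00) + math.asin(E01) + math.asin(E10) - math.asin(E11)
    return abs(total) <= math.pi + tol

r = 1 / math.sqrt(2)
tsirelson_point = (r, r, r, -r)          # correlators reaching S = 2*sqrt(2)
too_strong = (0.8, 0.8, 0.8, -0.8)       # S = 3.2, beyond the Tsirelson bound
pr_box = (1.0, 1.0, 1.0, -1.0)           # S = 4, the algebraic maximum
```

The first point saturates the inequality (the sum of arcsines equals $\pi$), while the other two violate it and are therefore flagged as non-quantum.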

Going back to the general discussion, we left the problem as unparametrizable as before: Lemma 4.2 does not constrain the family of operators for which one should check $M \geq 0$; and even if all operators are checked, we have nothing more than a necessary condition for $\mathcal{P}_{\mathcal{X},\mathcal{Y}} \in \mathcal{Q}$. Fortunately, it has been proved by Navascués, Pironio and Acín (NPA) that there exists a convergent hierarchy of criteria [Navascués, Pironio and Acín 2007]. The simplest test in the hierarchy, which is however already tighter than our example above, takes as $F$'s the identity $\mathbb{1}$ and all the measurement operators $E^x_a\otimes\mathbb{1} = \Pi^x_a$ and $\mathbb{1}\otimes E^y_b = \Pi^y_b$. The set of the $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ for which $M \geq 0$ in this step is denoted $\mathcal{Q}_1$. The further steps of the hierarchy add to the list of $F$'s all the products of two (e.g. $\Pi^x_a\Pi^{x'}_{a'}$, $\Pi^x_a\Pi^y_b$), three (e.g. $\Pi^x_a\Pi^{x'}_{a'}\Pi^y_b$), etc. measurement operators. The sets of the $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ for which $M \geq 0$ at each stage are denoted $\mathcal{Q}_2$, $\mathcal{Q}_3$ etc. Clearly $\mathcal{Q}_1 \supseteq \mathcal{Q}_2 \supseteq \mathcal{Q}_3 \supseteq \dots$ and $\mathcal{Q}_n \supseteq \mathcal{Q}$ for all $n$. What is not trivial to prove is that this hierarchy is convergent, i.e. $\lim_{n\to\infty}\mathcal{Q}_n = \mathcal{Q}$. As such, the hierarchy of tests provides a necessary and sufficient condition for $\mathcal{P}_{\mathcal{X},\mathcal{Y}} \in \mathcal{Q}$.

All the steps of the hierarchy involve writing down a matrix with undefined elements 
and asking whether one can fill the gaps in such a way that it becomes non-negative. 
Such a computational problem is an example of a very well-known class known as semi- 
definite programs, on which the reader will find abundance of information upon searching. 
Let me just stress here that semi-definite programs are not only efficiently solvable: the 
solutions are also exact up to numerical precision, because the result can be upper- and 
lower-bounded simultaneously. Needless to say, efficient as the algorithms may be, the 
size of M grows very fast, so it is impossible to check the hierarchy up to arbitrary order. 
In practice, one compares the result after the first few steps with a known quantum result 
guessed to be optimal, and they often coincide, like the Tsirelson bound for CHSH in 
the example above. 

A final remark: the hierarchy is elegantly defined and widely used, but one should keep in mind that Lemma 4.2 allows for full freedom in constructing a set of tests. In particular, there is no need to go all the way from $\mathcal{Q}_n$ to $\mathcal{Q}_{n+1}$ in order to strengthen the constraint: for example, it is enough to add a single product of two measurement operators (e.g. $\Pi^x_a\Pi^y_b$) to the $F$'s that define $\mathcal{Q}_1$ to obtain a test which is a priori stronger than the latter.



4.4 Randomness amplification 

4.4.1 From science to devices

In paragraph 2.4.5 I explained that the violation of Bell inequalities proves the existence
of intrinsic randomness (unless one opts for a deterministic explanation that requires 
superluminal signaling). This suggests that Bell inequalities may be useful to generate 
random numbers for practical applications. This possibility must be argued further. 

The key issue is measurement independence. In paragraph 2.4.5 I argued that the possibility of measurement independence cannot be denied without denying a great part of the scientific method. Of course, this does not mean that measurement independence must be believed blindly for any Bell experiment: one can legitimately try and check that a given source and a given choice of settings are really independent. This check cannot be done in a device-independent way: it always leaves an element of trust. Usually, what one does is to implement the choice of the settings by "random number generators" built by the trusted parties themselves. At first sight, this seems to render the generation of randomness through a Bell test a pointless task: if one has already got a random number generator, why bother using it as seed of a Bell test to create other random numbers, instead of using it directly? Even further, many devices acting as random number generators are already produced and routinely used. So, is there an additional benefit in generating randomness using a Bell test? The answer is positive: the Bell-based protocol produces more, and above all better, randomness.

Indeed, notice that the choice of the settings need not be "random", but just uncorrelated from the source while the Bell test is running. If there are enough reasons to trust this to be true, then any source of weak, or even pseudo, randomness will do. But the product is very different: if a Bell inequality is violated, the outcomes of the Bell test are guaranteed to be random in a device-independent way, and private, since nobody can have an exact copy of Alice's list. This is nothing but a rephrasing of what the violation of Bell means: the outcomes could not possibly pre-exist, so they are guaranteed to be random and nobody else can possibly have a copy, otherwise the list would have been pre-existing after all. Furthermore, the settings for a run need to be kept confidential only until the outcomes of that run are produced.



For instance, Alice may take her second favorite book in its third French edition and select the
settings based on the sixth letter of each even line. This is a perfectly deterministic recipe, but Alice
may have good reasons to trust that the provider of the boxes will not guess it.

4.4.2 Randomness amplification using CHSH

Let us consider the CHSH test: if $S_{obs} > 2$, there is some randomness in $\mathcal{P}_{X,Y}$. For
simplicity, we focus on Alice's outcomes only, so we want to compute

$$P^*(a|x) = \max P(a|x) \quad \text{subject to} \quad \mathrm{CHSH}[\mathcal{P}_{X,Y}] = S_{obs} \ \text{ and } \ \mathcal{P}_{X,Y} \in \mathcal{Q}. \tag{71}$$

This is the most elementary example discussed in [Pironio et al. 2010]. We expect
$P^*(a|x) = 1$ for $S_{obs} \leq 2$ and $P^*(a|x) = \frac{1}{2}$ for $S_{obs} = 2\sqrt{2}$ because we know that
this condition self-tests the singlet.

The first constraint is linear, therefore easily dealt with. As we know from paragraph
4.3.2, the other constraint can be approximated by semi-definite criteria. In a systematic
approach, therefore, one would start by replacing $\mathcal{Q}$ with $\mathcal{Q}_1$, solve the relaxed
optimization with semi-definite programming, then check if one finds a quantum state and
measurement that reach the same result. If this is the case, one has the solution. If
not, one can iterate down the hierarchy. This could be a very instructive exercise for the
reader. Here, we can solve the optimization (71) analytically using the decomposition of
CHSH presented in paragraph 4.3.1.

Let us start with the observation that, since we do not bound the dimension of the
Hilbert space, the quantum state can be considered pure without loss of generality. Let
us write it in the basis which decomposes Alice's and Bob's operators as in paragraph
4.3.1: $|\Psi\rangle = \sum_{\alpha,\beta} \sqrt{p_{\alpha\beta}}\,|\psi^{\alpha\beta}\rangle$, where $|\psi^{\alpha\beta}\rangle$ is the suitable normalized two-qubit state.

Therefore, 

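$$S_{obs} = \sum_{\alpha,\beta} p_{\alpha\beta}\, S_{\alpha\beta}, \tag{72}$$

where $S_{\alpha\beta}$ is the CHSH value obtained on the two-qubit state $|\psi^{\alpha\beta}\rangle$.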

Similarly, any P{a\x) compatible with the constraints will be of the form 

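$$P(a|x) = \sum_{\alpha,\beta} p_{\alpha\beta}\, \langle\psi^{\alpha\beta}|\,\Pi_a^x \otimes \mathbb{1}\,|\psi^{\alpha\beta}\rangle. \tag{73}$$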

The problem has a standard form: maximize $P = \sum_k p_k P_k$ under the constraint
$S = \sum_k p_k S_k = S_{obs}$. In order to find the solution, we can first solve
the elementary problem $\max_{S_k = s} P_k = f(s)$. In the very reasonable case that $f''(s)$ does
not change sign, we have two possibilities: (i) if $f$ is concave, the solution to the main
problem is simply $P = f(S_{obs})$, obtained by setting all the $S_k = S_{obs}$; (ii) if $f$ is convex,
the solution to the main problem is given by the convex combination of boundary points
(for our problem, it would be $P = p\,\frac{1}{2} + (1-p)\,1$ with $p$ defined by $S_{obs} = p\,2\sqrt{2} + (1-p)\,2$).

Even if it is quite obvious, let me stress that the intrinsic randomness of the Bell test is not
characterized by the observed $P(a|x)$ from $\mathcal{P}_{X,Y}$: indeed, one can easily define LV distributions
with $P(a|x) = \frac{1}{2}$, for instance white noise.

Moreover, it is intuitive that a pure state will give the optimal solution: in a mixed state, one adds
classical randomness on top of the quantum randomness.

So, our next step is to find the maximal value of $\langle\psi|\Pi_a^x \otimes \mathbb{1}|\psi\rangle$ over the set of pure
two-qubit states, under the constraint that CHSH $= s$. From (38), we know that a state
with Schmidt decomposition $|\psi\rangle = \cos\theta\,|00\rangle + \sin\theta\,|11\rangle$, $0 \leq \theta \leq \pi/4$, can reach
$s = 2\sqrt{1 + \sin^2 2\theta}$. By writing down the projector for that state, we see that its most biased
marginal is $P(+|z) = \frac{1}{2}(1 + \cos 2\theta)$. States with larger values of $\theta$ can also reach the same $s$
(for suboptimal measurements), but their most biased marginal is definitely lower; states
with smaller values of $\theta$ cannot reach $s$. Therefore, by solving explicitly for $\theta$, we find after
trivial algebra $f(s) = \frac{1}{2}\left(1 + \sqrt{2 - (s/2)^2}\right)$. This function being concave, we have found
the solution to the general problem:

$$P^*(a|x) \leq \frac{1}{2}\left(1 + \sqrt{2 - \frac{S_{obs}^2}{4}}\right). \tag{74}$$

It is worth noticing that the bound can be achieved by the observed $P(a|x)$ under
ideal conditions. Indeed, as discussed in paragraph 3.2, the maximal violation of CHSH
for the two-qubit state $|\psi(\theta)\rangle$ can be achieved by choosing $z$ as one of Alice's
measurements. If one creates exactly that state and performs exactly the optimal measurements,
the observed $P(a|x)$ will be related to $S_{obs}$ through (74).
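
As a hedged numerical illustration, the following Python sketch evaluates the bound $\frac{1}{2}(1 + \sqrt{2 - S_{obs}^2/4})$ and checks that it coincides with the most biased marginal of the state $\cos\theta\,|00\rangle + \sin\theta\,|11\rangle$ at its maximal CHSH value. The function names are hypothetical and chosen only for this example; the formulas are the ones quoted in the text.

```python
import math


def chsh_max_for_theta(theta: float) -> float:
    """Maximal CHSH value of cos(theta)|00> + sin(theta)|11>, as quoted in the text."""
    return 2.0 * math.sqrt(1.0 + math.sin(2.0 * theta) ** 2)


def most_biased_marginal(theta: float) -> float:
    """P(+|z) for the same state, valid for 0 <= theta <= pi/4."""
    return 0.5 * (1.0 + math.cos(2.0 * theta))


def guessing_bound(s_obs: float) -> float:
    """Upper bound on P*(a|x) as a function of the observed CHSH value."""
    if s_obs <= 2.0:
        return 1.0  # no violation: the outcomes may be deterministic
    if s_obs > 2.0 * math.sqrt(2.0) + 1e-12:
        raise ValueError("S_obs exceeds the Tsirelson bound 2*sqrt(2)")
    # max(...) guards against a tiny negative argument caused by rounding
    return 0.5 * (1.0 + math.sqrt(max(0.0, 2.0 - (s_obs / 2.0) ** 2)))


if __name__ == "__main__":
    for theta in (0.0, math.pi / 16, math.pi / 8, math.pi / 4):
        s = chsh_max_for_theta(theta)
        print(f"theta={theta:.4f}  s={s:.4f}  "
              f"P(+|z)={most_biased_marginal(theta):.4f}  bound={guessing_bound(s):.4f}")
```

The two last columns agree for every $\theta$ in the range, which is the statement that the bound is reached under ideal conditions.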

4.5 Subtle is the device, and maybe malicious 

Device-independent assessment works because one cannot violate Bell inequalities by
accident. Of course, since a Bell test is a statistical test, a violation may happen as a
fluctuation, but the probability of such an event can be quantified exactly and this is not
the point I want to make. Suppose rather that one of the procedures for the Bell test
is implemented incorrectly: for instance, the synchronization procedure may fail, so that
Alice and Bob compare results that correspond to different pairs. Or suppose that there
is a failure in Bob's hardware, in such a way that the choice of the input (the position
of the knob) does not change the measurement that is really done. In either case, no
violation will be observed: the assessment will produce the conservative answer that the
device is not up to standard.

In fact, the conclusion of an allegedly device-independent assessment can be thwarted 
only if (i) there are loopholes in the Bell test or (ii) there is a failure related to the task 
itself to be assessed. These are the points that one has to check. 

We know already how to deal with point (i), but it is useful to revisit those items in
this new light: 

• The settings must be chosen by the users independently from the devices, so the 
users should trust the random number generators that make those choices (ideally 
by fabricating them themselves). 

• Cheating boxes could try and exploit the detection loophole. By requesting that
the boxes give an outcome each time that a setting is chosen, this cheat can be
ruled out easily.

• One should exclude signaling between the measurement devices, as we discussed
at the end of paragraph 2.4.5.

• Deviation from the i.i.d. scenario can be treated as the memory effects in
paragraph 2.4.4, provided that the scenario is not adversarial, i.e. if device-independent
assessment is made with the purpose of quantifying defects. If the scenario is
adversarial, however, the tools are just being developed and exceed the scope of this
text.

Recall that, in this text, I have always assumed that the statistical character of quantum
measurement is dealt with in a proper way.

As a side remark, notice how a hardware failure of the kind described above could easily lead to
observe a violation of the uncertainty relations. Indeed, if Bob believes that he is alternating
measurements of X and P, while in fact the same observable is being measured all the time, then the
observed variances will be equal and can be arbitrarily small.

Point (ii) arises because we are not aiming at describing the violation itself, but at 
accomplishing a task with uncharacterized devices. For instance, consider the ampli- 
fication of randomness, and assume that the Bell test is done by proper measurement 
of quantum entanglement, without any communication. Nevertheless, if a measurement 
box contains a signaling device, it can just leak out the list of random numbers at the 
end of the process. The random list is still guaranteed and fresh, but no longer private. 
Clearly, spacelike separation of the choices during the Bell test does not protect against 
such classical leakage. It is crucial to stress, however, that this is not a limitation of the 
device-independent assessment: whenever there is a claim of privacy, one must check at 
best, and ultimately trust, that the devices are not unduly leaking out information. 



The only nuisance, not a minor one at the moment of writing, is that very few experimental setups
can exhibit a violation of Bell inequalities while closing the detection loophole. As a result,
device-independent assessment is currently challenging, if not strictly unfeasible. But there is no
reason to believe that technology won't improve in the coming years.

For instance, see [Vazirani and Vidick 2011] for the certification of Bell-based randomness against a
quantum adversary.

5 The robustness of quantum knowledge

The LV model for quantum correlations is definitely falsified by the violation of Bell
inequalities. Logically, this does not force one to accept the quantum description of
statistics (3) as the only alternative: all that this means is that some $P(a,b|x,y,\lambda)$ in the
decomposition $\mathcal{P}_{X,Y} = \int d\lambda\,\rho(\lambda)\,\mathcal{P}_{X,Y,\lambda}$ must violate Bell inequalities. In particular, one may
still want to try and describe the observed statistics while recovering at least some of the
classical features that quantum theory denies. In this section, I am going to review further
results that show the robustness of quantum knowledge: sheer observations, which quantum
theory reproduces with simple finite-dimensional calculations, are sufficient to falsify many
apparently reasonable alternative models.

5.1 Two no-signaling relaxations of LV

For definiteness, let us focus on the statistics predicted by quantum theory for
measurement on a singlet state:

$$P(a,b|\vec{a},\vec{b}) = \frac{1}{4}\left(1 - ab\,\vec{a}\cdot\vec{b}\right), \qquad a,b \in \{-1,+1\}. \tag{75}$$

As we know by now, they can't be reproduced with LV; moreover, if one accepts the
validity of quantum theory, a small finite subset of them is sufficient for self-testing. Here
we are going to show two much more stringent results. They are theory-independent like
Bell's original theorem; and device-independent a fortiori. Before discussing them, I
need to introduce another useful inequality, because the CHSH inequality, with its two
settings and its Tsirelson bound, is not powerful enough to reach these conclusions.

5.1.1 A tool: the chained inequality 

The chained inequality is a bipartite inequality with $M_A = M_B = M$ settings and two
outputs on each side, based on the following sum of $2M$ terms

$$C_M = P(a_1 = b_1) + P(b_1 = a_2) + P(a_2 = b_2) + \dots + P(a_M = b_M) + P(b_M \neq a_1) \tag{76}$$

where $P(a_x, b_y)$ is a convenient shorthand for $P(a,b|x,y)$ and $P(a_j = b_k) = P(+1,+1|j,k) + P(-1,-1|j,k)$,
$P(a_j \neq b_k) = P(+1,-1|j,k) + P(-1,+1|j,k)$. The assumption of LV
enforces the bound $C_M \leq C_{M,L} = 2M - 1$, while the algebraic bound (achievable with
no-signaling distributions) is obviously $C_{M,NS} = 2M$. As a kind of exercise, let me
mention two alternative ways of writing the same inequality, which may be useful:

• Using $E_{jk} = 2P(a_j = b_k) - 1 = 1 - 2P(a_j \neq b_k)$, one can rewrite everything with
correlators:

$$C'_M = E_{11} + E_{21} + E_{22} + E_{32} + \dots + E_{MM} - E_{1M} \tag{77}$$

with $C'_{M,L} = 2(M-1)$ and $C'_{M,NS} = 2M$. From this expression, it is manifest that
the chained inequality for $M = 2$ is equivalent to CHSH.

• For paragraph 5.1.3 it is useful to invert all the terms of the equation using
$P(a_j = b_k) + P(a_j \neq b_k) = 1$ and get

$$C''_M = 2M - C_M = P(a_1 \neq b_1) + P(b_1 \neq a_2) + P(a_2 \neq b_2) + \dots + P(a_M \neq b_M) + P(b_M = a_1). \tag{78}$$

The local constraint reads $C''_M \geq C''_{M,L} = 1$, the algebraic bound is $C''_{M,NS} = 0$.

The chained inequality has many properties that one may consider as sub-optimal: it
needs $M$ settings per party but only uses $2M$ of the $M^2$ correlators, only one of which
expresses a condition that is incompatible with the others under LV. Also, for any $M > 2$,
the LV bound does not define a facet of the local polytope, which means that there exist
tighter inequalities for each $M$.
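
The LV bound $2M - 1$ and the number of deterministic local points that saturate it can be checked by brute force for small $M$. The sketch below is only an illustration under the conventions of (76); the function names are hypothetical. It enumerates all $2^{2M}$ deterministic strategies.

```python
from itertools import product


def chained_value(a, b):
    """C_M of Eq. (76) for a deterministic strategy; a[j], b[j] take values +1 or -1."""
    m = len(a)
    total = 0
    for j in range(m):
        total += a[j] == b[j]              # term P(a_j = b_j)
        if j + 1 < m:
            total += b[j] == a[j + 1]      # term P(b_j = a_{j+1})
    total += b[m - 1] != a[0]              # closing term P(b_M != a_1)
    return total


def local_bound_and_saturating_points(m):
    """Return (max of C_M over deterministic points, number of points reaching it)."""
    best, count = -1, 0
    for bits in product((+1, -1), repeat=2 * m):
        value = chained_value(bits[:m], bits[m:])
        if value > best:
            best, count = value, 1
        elif value == best:
            count += 1
    return best, count


if __name__ == "__main__":
    for m in range(2, 6):
        print(m, local_bound_and_saturating_points(m), "dimension:", m * m + 2 * m)
```

For each $M$ tried, the maximum is $2M - 1$ and it is reached by $4M$ points, to be compared with the dimension $M^2 + 2M$ of the no-signaling space.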

That only $2M$ correlators are used is not at all a problem for the theorist; but for the
experimentalist, it means that most of the time one is recording data that won't actually be used.

The proof that the chained inequality is not a facet for $M > 2$ is simple. Using (22), we know that
the polytope is embedded in a space of dimension $D_{NS}(M) = M^2 + 2M$. Facets are hyperplanes of
dimension $D_{NS} - 1$, so there must be $D_{NS}$ linearly independent points lying on each facet. In
particular, there must be at least $D_{NS}$ extremal points on each facet. However, only $4M$ extremal
points saturate the chained inequality, as a direct count shows.

The main interest of the family of chained inequalities is that the quantum violation
comes arbitrarily close to the algebraic one in the limit $M \to \infty$. To prove this claim, it is
enough to produce an example (which turns out to be the maximal violation achievable
with quantum physics for each fixed $M$): consider a two-qubit maximally entangled state
$|\Phi^+\rangle$ and the settings taken in the $x$-$z$ plane of the Bloch sphere as

$$\vec{a}_j = \cos\theta_{2j-1}\,\hat{z} + \sin\theta_{2j-1}\,\hat{x}, \qquad \vec{b}_j = \cos\theta_{2j}\,\hat{z} + \sin\theta_{2j}\,\hat{x} \tag{79}$$

where $\theta_k = \frac{k\pi}{2M}$. With this choice, all the probabilities in (76) become equal to
$\frac{1}{2}\left(1 + \cos\frac{\pi}{2M}\right) = \cos^2\frac{\pi}{4M}$ and consequently

$$C_M = 2M\cos^2\frac{\pi}{4M}, \qquad \text{so that} \quad C_{M,NS} - C_M = 2M\sin^2\frac{\pi}{4M} \to 0 \ \text{ for } M \to \infty. \tag{80}$$

This is the property that we are going to use to demonstrate the following two results.
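
The quantum value of the chained expression can be checked numerically. The sketch below assumes the angles $\theta_k = k\pi/(2M)$ and uses the fact that, for $|\Phi^+\rangle$ and settings in the $x$-$z$ plane, $P(a = b) = \frac{1}{2}(1 + \cos(\theta_a - \theta_b))$; the function names are hypothetical.

```python
import math


def prob_equal(angle_a: float, angle_b: float) -> float:
    """P(a = b) on |Phi+> for two settings in the x-z plane of the Bloch sphere."""
    return 0.5 * (1.0 + math.cos(angle_a - angle_b))


def chained_quantum_value(m: int) -> float:
    """C_M of Eq. (76) for the settings of Eq. (79), assuming theta_k = k*pi/(2M)."""
    def theta(k):
        return k * math.pi / (2 * m)

    a = [theta(2 * j - 1) for j in range(1, m + 1)]
    b = [theta(2 * j) for j in range(1, m + 1)]
    total = 0.0
    for j in range(m):
        total += prob_equal(a[j], b[j])
        if j + 1 < m:
            total += prob_equal(b[j], a[j + 1])
    total += 1.0 - prob_equal(b[m - 1], a[0])   # closing term P(b_M != a_1)
    return total


if __name__ == "__main__":
    for m in (2, 3, 10, 100):
        closed_form = 2 * m * math.cos(math.pi / (4 * m)) ** 2
        print(m, chained_quantum_value(m), closed_form, 2 * m)
```

For $M = 2$ the value is $2 + \sqrt{2}$, which is the Tsirelson bound of CHSH written in terms of probabilities.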



5.1.2 The singlet statistics have zero local fraction 

The first idea [Elitzur, Popescu and Rohrlich 1992, Barrett, Kent and Pironio 2006]
consists in writing every $P$ in $\mathcal{P}_{X,Y}$ as the convex sum

$$P(a,b|x,y) = p\int d\lambda\,\rho(\lambda)\,P(a|x,\lambda)P(b|y,\lambda) + (1-p)\,P(a,b|x,y,\mu)$$
$$\phantom{P(a,b|x,y)} = p\,P_L(a,b|x,y) + (1-p)\,P_{NS}(a,b|x,y) \tag{81}$$

where $p \in [0,1]$, $P_L$ can be achieved with LV and $P_{NS} = \frac{P - pP_L}{1-p}$ is only requested to be
a valid probability distribution (it is no-signaling by construction, since both $P$ and $P_L$
are). The local fraction $p_L$ of $\mathcal{P}_{X,Y}$ is defined as

$$p_L = \max_{\text{(81) holds}} p. \tag{82}$$

The local fraction is a natural figure of merit in terms of simulations: if one wants
to simulate the observed correlations $P$, the local part $P_L$ comes for free. Clearly $\mathcal{P}_{X,Y}$
violates at least one Bell inequality if and only if $p_L < 1$; moreover, the observed violation
of a Bell inequality puts an upper bound on $p_L$. Indeed, let $I_L$, $I_{obs}$ and $I_{alg}$ be the local
bound, the observed value, and the algebraic bound for the inequality under study. From (81) it
follows immediately that $I_{obs} \leq p\,I_L + (1-p)\,I_{alg}$, whence

$$p_L \leq \frac{I_{alg} - I_{obs}}{I_{alg} - I_L}. \tag{83}$$

If $I_{obs} > I_L$, the bound on $p_L$ is not trivial. In particular, if $\mathcal{P}_{X,Y}$ leads to $I_{obs} = I_{alg}$,
then $p_L = 0$. This holds for the singlet statistics (75) and the chained inequality in the
limit $M \to \infty$, so we have proved

Theorem 5.1. The singlet statistics (75), and even the subset obtained by restricting $\vec{a}$
and $\vec{b}$ to lie in a plane, have zero local fraction.

The count of the extremal points that saturate the chained inequality goes as follows. Take (76):
$C_M = 2M - 1$ can be reached either by $a_1 = b_1 = a_2 = \dots = b_M$ [in which case $P(b_M \neq a_1) = 0$
and all the other probabilities are 1] or by a point of the type $a_1 = b_1 = \dots = a_k \neq b_k = a_{k+1} = \dots = b_M$
[in which case $P(a_k = b_k) = 0$ and all the other probabilities are 1; there are clearly $2M - 1$ positions for
the $\neq$ sign]. There are certainly no other points, so, given that $a_1$ can take two values, we have found
that $4M$ extremal points saturate the chained inequality as claimed. Since $4M < D_{NS}(M)$ for $M > 2$,
the chained inequality cannot be a facet.
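
The bound (83) behind Theorem 5.1 can be evaluated for the chained inequality. This is a minimal sketch, assuming the quantum value $C_M = 2M\cos^2(\pi/4M)$ as the observed value; the function names are hypothetical.

```python
import math


def local_fraction_bound(i_obs: float, i_local: float, i_alg: float) -> float:
    """Upper bound (83) on the local fraction, clipped to the interval [0, 1]."""
    return max(0.0, min(1.0, (i_alg - i_obs) / (i_alg - i_local)))


def chained_local_fraction_bound(m: int) -> float:
    """Bound (83) for the chained inequality with M settings per party."""
    c_quantum = 2 * m * math.cos(math.pi / (4 * m)) ** 2   # assumed observed value
    return local_fraction_bound(c_quantum, 2 * m - 1, 2 * m)


if __name__ == "__main__":
    for m in (2, 3, 5, 10, 100, 1000):
        print(m, chained_local_fraction_bound(m))
```

For $M = 2$ (CHSH) the bound is $2 - \sqrt{2} \approx 0.586$; it decreases roughly as $\pi^2/(8M)$, which is how the zero local fraction emerges in the limit.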

5.1.3 The singlet statistics force fully random marginals 

The second result is the achievement of a series of works initiated by Leggett. Let us
recall the result of self-testing: if one assumes quantum physics to be valid, some observed
$\mathcal{P}_{X,Y}$ can only be due to the measurement of a maximally entangled state. Within
quantum theory this implies that the properties of the composite system are sharply
defined while those of each sub-system are completely undefined. One may wonder
whether a more classical picture can be recovered at the level of probabilities and try
to reconstruct a given $\mathcal{P}_{X,Y}$ using $\mathcal{P}_{X,Y,\lambda}$ that have biased marginals. As an example,
Leggett studied whether one can find a decomposition $\mathcal{P}_{X,Y} = \int d\lambda\,\rho(\lambda)\,\mathcal{P}_{X,Y,\lambda}$ of the singlet
statistics (75), whose marginals would look like those of pure single-qubit states,
$P(a|\vec{a},\lambda) = \frac{1}{2}(1 + a\,\vec{u}_\lambda\cdot\vec{a})$ and $P(b|\vec{b},\lambda) = \frac{1}{2}(1 + b\,\vec{v}_\lambda\cdot\vec{b})$. Generically, if the
$\mathcal{P}_{X,Y,\lambda}$ are allowed to be signaling, they can even be completely deterministic. However, if the
$\mathcal{P}_{X,Y,\lambda}$ are further restricted to be no-signaling (as in Leggett's example), very strong
constraints can be proved, notably

Theorem 5.2. Consider any decomposition $\mathcal{P}_{X,Y} = \int d\lambda\,\rho(\lambda)\,\mathcal{P}_{X,Y,\lambda}$ of the singlet statistics (75); if the
$P(a,b|\vec{a},\vec{b},\lambda)$ are requested to satisfy the no-signaling constraint, then they must have
fully random marginals, i.e. $P(a|\vec{a},\lambda) = P(b|\vec{b},\lambda) = \frac{1}{2}$ [Colbeck and Renner 2008].

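Proof. The proof uses the statistical distance between two conditional probability
distributions

$$D(P_{U|\omega}, P_{V|\omega}) = \frac{1}{2}\sum_{u}\left|P_U(u|\omega) - P_V(u|\omega)\right|, \tag{84}$$

where the sum runs over the common alphabet of $U$ and $V$.
This distance is obviously symmetric by exchange of $U$ and $V$; it satisfies the triangle
inequality as well as $D(P_{U|\omega}, P_{V|\omega}) \leq P(u \neq v|\omega)$.

Let us start from the chained inequality in its form (78) applied to one of the
$\mathcal{P}_{X,Y,\lambda} \equiv \mathcal{P}_\lambda$. Using the bound just mentioned, we find

$$C''_M(\mathcal{P}_\lambda) \geq D(P_{A|11\lambda}, P_{B|11\lambda}) + D(P_{B|21\lambda}, P_{A|21\lambda}) + D(P_{A|22\lambda}, P_{B|22\lambda}) + \dots$$
$$\dots + D(P_{A|MM\lambda}, P_{B|MM\lambda}) + D(P_{B|1M\lambda}, 1 - P_{A|1M\lambda})$$

with obvious notations. Now we use the no-signaling property $P_{A|xy\lambda} = P_{A|x\lambda}$ and
$P_{B|xy\lambda} = P_{B|y\lambda}$. Then we can apply the triangle inequality as

$$D(P_{A|x=1,\lambda}, P_{B|y=1,\lambda}) + D(P_{B|y=1,\lambda}, P_{A|x=2,\lambda}) \geq D(P_{A|x=1,\lambda}, P_{A|x=2,\lambda})$$

and by repeated application we finally reach

$$C''_M(\mathcal{P}_\lambda) \geq D(P_{A|x=1,\lambda}, 1 - P_{A|x=1,\lambda}) = \left|P(a=0|x=1,\lambda) - \tfrac{1}{2}\right| + \left|P(a=1|x=1,\lambda) - \tfrac{1}{2}\right|. \tag{85}$$

Now, the chained inequality is based on a linear expression, therefore the decomposition
$\mathcal{P} = \int d\lambda\,\rho(\lambda)\,\mathcal{P}_\lambda$ implies $C''_M(\mathcal{P}) = \int d\lambda\,\rho(\lambda)\,C''_M(\mathcal{P}_\lambda)$. If $\mathcal{P}$ are the
statistics (75) of the singlet, in theory we can have $C''_M(\mathcal{P}) = 0$ (in the limit $M \to \infty$). Since this
value is the no-signaling bound, it implies $C''_M(\mathcal{P}_\lambda) = 0$ for all $\lambda$, which, inserted in (85),
proves the theorem for $P(a|x=1,\lambda)$. The proof for all other
settings is exactly the same: one just has to keep the suitable terms when iterating the
triangle inequality. □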
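
The two properties of the statistical distance used in the proof can be spot-checked numerically for binary variables. The sketch below assumes the standard form of (84), half the L1 distance on a common alphabet, and draws random joint distributions; all function names are hypothetical.

```python
import random


def statistical_distance(p, q):
    """Half the L1 distance between two distributions given as dicts on one alphabet."""
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)


def random_joint(rng):
    """A random joint distribution P(u, v) of two binary variables."""
    weights = {(u, v): rng.random() for u in (0, 1) for v in (0, 1)}
    norm = sum(weights.values())
    return {uv: w / norm for uv, w in weights.items()}


def marginals(joint):
    p_u = {k: sum(w for (u, _), w in joint.items() if u == k) for k in (0, 1)}
    p_v = {k: sum(w for (_, v), w in joint.items() if v == k) for k in (0, 1)}
    return p_u, p_v


def prob_differ(joint):
    return sum(w for (u, v), w in joint.items() if u != v)


def flipped(p):
    """The distribution written 1 - P in the proof (each probability replaced by 1 - P)."""
    return {k: 1.0 - p[k] for k in p}


def check_bounds(samples=1000, seed=0):
    """Check D(P_U, P_V) <= P(u != v) and D(P, 1 - P) = |2 P(0) - 1| on random samples."""
    rng = random.Random(seed)
    for _ in range(samples):
        joint = random_joint(rng)
        p_u, p_v = marginals(joint)
        if statistical_distance(p_u, p_v) > prob_differ(joint) + 1e-12:
            return False
        bias = abs(2.0 * p_u[0] - 1.0)
        if abs(statistical_distance(p_u, flipped(p_u)) - bias) > 1e-12:
            return False
    return True


if __name__ == "__main__":
    print(check_bounds())
```

The second check is the content of (85): the right-hand side is the bias of the marginal, so a vanishing $C''_M(\mathcal{P}_\lambda)$ forces $P(a|x=1,\lambda) = \frac{1}{2}$.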



5.2 Signaling models: simulation with communication

In the whole text, I adopted the view that the violation of Bell inequalities demonstrates 
intrinsic randomness. It matches how most physicists understand quantum physics and, 
as we have seen, it can even be related to potentially useful applications. This approach, 
however, does not address the reason why many (rightly or wrongly) feel uneasy with 
quantum physics: it tells you what you get out of it, but not how nature does it.

The violation of Bell inequalities leaves very little choice: any explanation of quantum 
correlations in classical terms must involve communication. Moreover, the predictions of 
quantum theory do not vary if the choice of settings and the detection events are space- 
like separated, and several experiments have confirmed the violation of Bell inequalities 
in this configuration: therefore, the hypothetical communication would have to be su- 
perluminal. If one wants to tell "how nature does it" , the explanation must involve this 
rather problematic feature. A comprehensive discussion would need to address the
modifications of special relativity, a much debated topic that would bring us too far. 
Here, I address signaling models under the pragmatic angle of simulation: never mind 
how nature really does it, which resources would we need to simulate quantum statistics? 



5.2.1 How much? 

Let us first study the amount of communication required to simulate quantum statistics. 
One may naively guess this amount must be infinite, based on steering: in quantum 
theory, by choosing her measurement on an entangled state, Alice can prepare Bob's 
system in any state, and there are continuously many states. However, this reasoning 
fails because the simulators Anthony and Beatrix are allowed to share some LV. In
fact, at the moment of writing, no example of a quantum process is known, whose 
simulation would provably require an unbounded amount of communication (though 
some are conjectured). Here I present only the most famous result, due to Toner and 
Bacon [Toner and Bacon 2003] : 

Theorem 5.3. The statistics (75) of the singlet state under all possible von Neumann
measurements can be simulated with local variables and one bit of communication. 

After writing this text out of my head, I was reminded that Science Magazine published a
contribution by Gisin with the title "How does nature perform the trick?" [Gisin 2009]. Probably the
expression was in my subconscious.

Of course, rigorously speaking, all that one can say is that the explanation must involve something
that appears as superluminal communication in our common (3+1)-dimensional space-time. One can
trade superluminal communication with other speculative features: for instance, at this level of fantasy,
one can think that every ensemble of entangled degrees of freedom remains at the same place in some
extra dimension.

Recall that I use the names of Alice and Bob for the users that choose measurement settings and
observe outcomes. Think of Anthony and Beatrix as the mechanisms inside the boxes of Alice and Bob
respectively.

Proof. The proof is constructive and I present it in the version of [Degorre, Laplante and Roland 2005].

In each run, Anthony and Beatrix share two unit vectors $\vec{\lambda}_0$ and $\vec{\lambda}_1$, previously drawn
with uniform distribution on the unit sphere $\mathbb{S}^2$. Anthony selects one of the two vectors
according to the following rule: if $|\vec{a}\cdot\vec{\lambda}_0| > |\vec{a}\cdot\vec{\lambda}_1|$, he sets $\vec{\lambda} = \vec{\lambda}_0$; otherwise, he sets
$\vec{\lambda} = \vec{\lambda}_1$. He communicates this choice to Beatrix (this is the bit of communication).
Finally, Anthony outputs $a(\vec{\lambda}) = \mathrm{sign}(\vec{a}\cdot\vec{\lambda})$, Beatrix outputs $b(\vec{\lambda}) = -\mathrm{sign}(\vec{b}\cdot\vec{\lambda})$.
Clearly $\langle a\rangle = \langle b\rangle = 0$, so now we need to prove that

$$\langle ab\rangle = \int d\vec{\lambda}\,\rho(\vec{\lambda})\,a(\vec{\lambda})\,b(\vec{\lambda}) = -\vec{a}\cdot\vec{b}. \tag{86}$$

The important piece is the effective probability distribution $\rho(\vec{\lambda})$, which is not uniform,
because Anthony's initial selection biases $\vec{\lambda}$ to be close to $\vec{a}$. Concretely, we shall prove
in Lemma 5.1 below that $\rho(\vec{\lambda}) = \frac{1}{2\pi}|\vec{a}\cdot\vec{\lambda}|$. So we have

$$\langle ab\rangle = -\int_{\mathbb{S}^2} d\vec{\lambda}\,\frac{1}{2\pi}\,\underbrace{|\vec{a}\cdot\vec{\lambda}|\,\mathrm{sign}(\vec{a}\cdot\vec{\lambda})}_{=\,\vec{a}\cdot\vec{\lambda}}\,\mathrm{sign}(\vec{b}\cdot\vec{\lambda})$$

and the result is readily derived by passing to spherical coordinates chosen such that
$\vec{a} = \hat{z}$ and $\vec{b} = \cos\beta\,\hat{z} + \sin\beta\,\hat{x}$. □
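
The protocol just described is easy to simulate. The following Monte Carlo sketch is an illustration only, with hypothetical function names and an arbitrary choice of seed and sample size; it estimates $\langle a\rangle$, $\langle b\rangle$ and $\langle ab\rangle$ and compares the last one with $-\vec{a}\cdot\vec{b}$.

```python
import math
import random


def random_unit_vector(rng):
    """Uniform point on the unit sphere, obtained by normalizing a Gaussian triple."""
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:
            return (x / norm, y / norm, z / norm)


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def sign(t):
    return 1 if t >= 0 else -1


def one_run(a_vec, b_vec, rng):
    """One run: shared vectors, Anthony's one-bit message, then the two outputs."""
    lam0 = random_unit_vector(rng)
    lam1 = random_unit_vector(rng)
    bit = 0 if abs(dot(a_vec, lam0)) > abs(dot(a_vec, lam1)) else 1  # sent to Beatrix
    lam = (lam0, lam1)[bit]
    return sign(dot(a_vec, lam)), -sign(dot(b_vec, lam))


def estimate(a_vec, b_vec, runs=100_000, seed=1):
    """Return the sample averages (<a>, <b>, <ab>)."""
    rng = random.Random(seed)
    sum_a = sum_b = sum_ab = 0
    for _ in range(runs):
        a, b = one_run(a_vec, b_vec, rng)
        sum_a += a
        sum_b += b
        sum_ab += a * b
    return sum_a / runs, sum_b / runs, sum_ab / runs


if __name__ == "__main__":
    beta = 1.0
    a_vec = (0.0, 0.0, 1.0)
    b_vec = (math.sin(beta), 0.0, math.cos(beta))
    print(estimate(a_vec, b_vec), "target <ab> =", -math.cos(beta))
```

Note that Beatrix only ever uses the single bit and the shared vectors: her output does not depend on $\vec{a}$ in any other way.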

We have postponed the proof of the following 

Lemma 5.1. The selection procedure used in the proof above leads to an effective
distribution $\rho(\vec{\lambda}) = \frac{1}{2\pi}|\vec{a}\cdot\vec{\lambda}|$.

Proof. Consider first the following procedure, known as rejection method:

1. Pick $\vec{\lambda}_0$ uniformly on $\mathbb{S}^2$ and $u_0$ uniformly in $[0,1]$;

2. Keep $\vec{\lambda}_0 = \vec{\lambda}$ if $|\vec{a}\cdot\vec{\lambda}_0| > u_0$, discard it otherwise.

The probability that a given $\vec{\lambda}_0$ is kept is the probability that a $u_0$ smaller than $|\vec{a}\cdot\vec{\lambda}_0|$ is
drawn; since $u_0$ is drawn uniformly, this probability is just $|\vec{a}\cdot\vec{\lambda}_0|$. Therefore $\rho(\vec{\lambda}) \propto |\vec{a}\cdot\vec{\lambda}|$.
By normalizing a posteriori, one finds the factor $\frac{1}{2\pi}$.

The problem with this procedure, as the name indicates, is that several $\vec{\lambda}_0$ are
discarded (in the context of this paper, it would amount to a detection loophole scheme in
which Alice's box refuses to reply if her local variable does not match a desired
condition). The introduction of $\vec{\lambda}_1$ solves the problem. Indeed, if $\vec{\lambda}_1$ is chosen uniformly in
$\mathbb{S}^2$, $u_0 = |\vec{a}\cdot\vec{\lambda}_1|$ is uniform in $[0,1]$. So the procedure with which Anthony selects $\vec{\lambda}_0$ is the
rejection method. By symmetry, the procedure with which Anthony selects $\vec{\lambda}_1$ is also
the rejection method. Therefore the final $\vec{\lambda}$ is distributed according to $\rho(\vec{\lambda}) = \frac{1}{2\pi}|\vec{a}\cdot\vec{\lambda}|$
as claimed, while no instance is discarded. □
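
The equivalence claimed in the lemma can be illustrated by sampling. Under $\rho(\vec{\lambda}) = \frac{1}{2\pi}|\vec{a}\cdot\vec{\lambda}|$ the variable $w = |\vec{a}\cdot\vec{\lambda}|$ has density $2w$ on $[0,1]$, hence mean $2/3$. The sketch below (hypothetical names, arbitrary seed) compares the rejection method, which discards about half of the draws, with the two-vector selection, which discards none.

```python
import math
import random


def random_unit_vector(rng):
    """Uniform point on the unit sphere, obtained by normalizing a Gaussian triple."""
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:
            return (x / norm, y / norm, z / norm)


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def rejection_sample(a_vec, rng):
    """One trial of the rejection method; returns the kept vector or None if discarded."""
    lam0 = random_unit_vector(rng)
    u0 = rng.random()
    return lam0 if abs(dot(a_vec, lam0)) > u0 else None


def two_vector_sample(a_vec, rng):
    """Anthony's selection between two shared vectors; never discards."""
    lam0 = random_unit_vector(rng)
    lam1 = random_unit_vector(rng)
    return lam0 if abs(dot(a_vec, lam0)) > abs(dot(a_vec, lam1)) else lam1


def compare(a_vec, trials=100_000, seed=2):
    """Return (acceptance rate, mean |a.lambda| by rejection, mean |a.lambda| by selection)."""
    rng = random.Random(seed)
    kept = []
    for _ in range(trials):
        lam = rejection_sample(a_vec, rng)
        if lam is not None:
            kept.append(abs(dot(a_vec, lam)))
    selected = [abs(dot(a_vec, two_vector_sample(a_vec, rng))) for _ in range(trials)]
    return len(kept) / trials, sum(kept) / len(kept), sum(selected) / trials


if __name__ == "__main__":
    print(compare((0.0, 0.0, 1.0)), "expected about (0.5, 0.667, 0.667)")
```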

Let me finish with a brief assessment. On the one hand, it is quite remarkable that the
statistics of the singlet, that are so strongly non-classical according to several criteria
presented in the previous sections, are only one bit away from being classical in terms
of communication. On the other hand, this and all signaling models have an unpleasant
taste of fine-tuning: indeed, the use of the bit of communication in the simulation
above is entirely ad hoc and justified only a posteriori by the fact that it reproduces the
quantum statistics. Worse, almost any deviation from that rule would manifest itself in
signaling.

5.2.2 How fast? 

In order to reproduce all the predictions of quantum theory, the superluminal communi- 
cation should in fact have infinite speed, because the entangled systems can be arbitrarily 
far apart. Remarkably, one can give a device-independent proof of this constraint, at least 
within a reasonable scenario: 

Theorem 5.4. Consider a theory that simulates quantum statistics with superluminal 
communication in a preferred frame. Then, either the speed of communication is infinite, 
or the theory must predict the possibility of sending messages faster than light ("observable
signaling") for some arrangement of measurements in spacetime. The conclusion can be 
based only on observed statistics and does not require the candidate preferred frame to be 
identified. 

In order to understand the theorem and how one can possibly prove such a result, we 
have to start by describing the signaling model in some detail. The desiderata are: 

(D1) Quantum statistics are simulated by LV, supplemented by a superluminal
communication propagating at speed $v < \infty$ in a preferred frame;

(D2) The theory does not predict the possibility for us to send a message faster than
light (no observable signaling).

Let $\mathcal{E}_A$ denote the event in spacetime at which Alice chooses her measurement and
gets her outcome (for simplicity, I suppose that these procedures take negligible time).
Consider now a bipartite experiment. If $v < \infty$, three arrangements are possible:
$\mathcal{E}_A \to \mathcal{E}_B$ (that is, $\mathcal{E}_B$ lies in the future $v$-cone of $\mathcal{E}_A$), $\mathcal{E}_B \to \mathcal{E}_A$, or $\mathcal{E}_A$ and $\mathcal{E}_B$
are outside each other's $v$-cone. It is very reasonable to concretize (D1) as:

(D1a) In the arrangements $\mathcal{E}_A \to \mathcal{E}_B$ or $\mathcal{E}_B \to \mathcal{E}_A$, the observed statistics are compatible
with a quantum state, further assumed to be the same in both cases.

(D1b) If $\mathcal{E}_A$ and $\mathcal{E}_B$ are outside each other's $v$-cone, the observed statistics must be
reproducible with LV. Notice that this arrangement never occurs if $v = \infty$.

I thank Rob Spekkens for the expression "fine-tuning".

The fact that the state is the same does not play any role in what follows, but it is what one expects
from quantum physics, in which the time-ordering of the measurements does not change the state. This
whole requirement is another instance of the fine-tuning of signaling models: obviously, once Pandora's
box is opened and a signal is allowed, there is no compelling reason a priori to impose the observation
of quantum statistics at all; but a posteriori, we know that quantum statistics are observed and we have
set out precisely to try and simulate them.

Such a theory predicts a departure from quantum theory in the arrangement (D1b) and
can therefore be tested in principle. Indeed, Alice and Bob can first arrange $\mathcal{E}_A \to \mathcal{E}_B$
and check that a Bell inequality is violated (with additional knowledge of the degree
of freedom, they may even do a full tomography of the state). Then, Bob can bring
his measurement to lie outside the $v$-cone of $\mathcal{E}_A$: the observed statistics should cease
violating Bell inequalities. The design of the experiment may not be trivial, since we do
not know which is the preferred frame, but there does not seem to be anything a priori
inconsistent in such a theory; in particular, it is trivial to find examples for which (D2)
is satisfied. However, matters change when the theory is extended to three (or more)
systems.




Fig. 3. The two possible arrangements described in the text (I top, II bottom). The thin full
arrows represent the superluminal influences, the dashed arrow is a normal light signal. At
point P, A can receive classical information about the setting and outcome of C, so correlations
between A and C can be computed; while no light signal from B could have arrived to A.
Therefore, the correlations A-C must be the same in both arrangements, otherwise
faster-than-light communication would become possible from B to A (simply by B deciding to perform
his measurement earlier or later).

Indeed, consider the following two arrangements of three measurements at different
locations, with C located between A and B (Fig. 3):

(I) $\mathcal{E}_A \to \mathcal{E}_B \to \mathcal{E}_C$.

(II) $\mathcal{E}_A \to \mathcal{E}_C$ and $\mathcal{E}_B \to \mathcal{E}_C$, but $\mathcal{E}_A$ and $\mathcal{E}_B$ are outside each other's $v$-cone; moreover,
$\mathcal{E}_C$ is space-like separated from both $\mathcal{E}_A$ and $\mathcal{E}_B$ according to the usual light cones.

In configuration (I), one observes some quantum statistics $\mathcal{P}_{X,Y,Z}$. In configuration
(II), a departure from quantum statistics may happen if $\mathcal{P}_{X,Y}$ violates Bell, because by
construction $\mathcal{P}'_{X,Y,Z}$ must be such that

$$\mathcal{P}'_{X,Y} \ \text{is compatible with LV}. \tag{87}$$

So far, we have exploited only (D1). Crucially, now (D2) imposes the two additional
conditions

$$\mathcal{P}'_{X,Z} = \mathcal{P}_{X,Z} \quad \text{and} \quad \mathcal{P}'_{Y,Z} = \mathcal{P}_{Y,Z}, \tag{88}$$

for the simple reasoning explained in the caption of Fig. 3. In turn, such conditions may
impose constraints on the possible $\mathcal{P}'_{X,Y}$. Now one can hope to find a contradiction in
the following way: start from some quantum statistics $\mathcal{P}_{X,Y,Z}$ and compute $\mathcal{P}_{X,Z}$ and
$\mathcal{P}_{Y,Z}$; if all the statistics $\mathcal{P}'_{X,Y,Z}$ compatible with those marginals are such that $\mathcal{P}'_{X,Y}$
violates Bell, then no $\mathcal{P}'_{X,Y,Z}$ can satisfy (87) and (88). Therefore, one of the desiderata
must be dropped.

Such a three-partite example has not been found yet, but a four-partite example
exploiting similar contradictions has, thus proving Theorem 5.4. That example is rather
complex: there is little added value in reproducing it here, compared to directing the
reader to the published paper [Bancal et al. 2012]. I'd rather present here a partial proof
[Scarani and Gisin 2005] that gives at least some intuition and, in the process, lets the
reader learn a nice quantum information result.

For this sake, I strengthen (D1) by requiring that any statistics of the theory, in
particular $\mathcal{P}'_{X,Y,Z}$, must be quantum statistics. Also, I renounce full device-independence
and assume that the systems are known, so tomography is possible. In this case, (88) is
replaced by the stronger constraint

$$\rho'_{AC} = \rho_{AC} \quad \text{and} \quad \rho'_{BC} = \rho_{BC}. \tag{89}$$
The contradiction is then based on the following 

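Lemma 5.2. Consider a system $\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^3$. For $0 < \alpha < \frac{\pi}{2}$, there is only one
quantum state such that

$$\rho_{AC} = \rho_{BC} = \tfrac{1}{2}|\psi_1\rangle\langle\psi_1| + \tfrac{1}{2}|\psi_2\rangle\langle\psi_2| \tag{90}$$

with $|\psi_1\rangle = \sin\alpha\,|00\rangle + \cos\alpha\,|12\rangle$ and $|\psi_2\rangle = \sin\alpha\,|11\rangle + \cos\alpha\,|02\rangle$; namely, the pure state
$|\Psi\rangle = \cos\alpha\,|\Psi^+\rangle_{AB}|2\rangle_C + \sin\alpha\,|GHZ\rangle_{ABC}$. Moreover, this state is such that $\rho_{AB}$ violates
the CHSH inequality for $\cos^2\alpha > \frac{1}{\sqrt{2}}$.

Proof. Since $|\psi_1\rangle$ and $|\psi_2\rangle$ are orthogonal, any purification of $\rho_{AC}$ can be written

$$|\Psi\rangle_{ABCX} = \tfrac{1}{\sqrt{2}}\left(|\psi_1\rangle_{AC}|E_1\rangle_{BX} + |\psi_2\rangle_{AC}|E_2\rangle_{BX}\right)$$

with $X$ an auxiliary mode and $\langle E_1|E_2\rangle = 0$. Further, since $B$ is a qubit, the Schmidt
decomposition yields

$$|E_1\rangle_{BX} = c_0|0\rangle_B|x_{10}\rangle_X + c_1|1\rangle_B|x_{11}\rangle_X$$
$$|E_2\rangle_{BX} = d_0|0\rangle_B|x_{20}\rangle_X + d_1|1\rangle_B|x_{21}\rangle_X$$

with $\langle x_{k0}|x_{k1}\rangle = 0$. We can insert these expressions into $|\Psi\rangle$ and compute the expression
for $\rho_{BC}$, then require it to be given by (90). Specifically, the requirement that $\rho_{BC}$ is
orthogonal to $|01\rangle_{BC}$ and $|10\rangle_{BC}$ forces $c_1 = d_0 = 0$, that in turn implies $c_0 = d_1 = 1$.
Using this condition, one further finds that $\rho_{BC}$ can be recovered if and only if $\langle x_{10}|x_{21}\rangle = 1$:
therefore, $|\Psi\rangle_{ABCX} = |\Psi\rangle_{ABC}|x\rangle_X$, i.e. $|\Psi\rangle_{ABC}$ is the only quantum state, pure or
mixed, compatible with the marginals (90).

Now one can compute $\rho_{AB}$ and use the expression for the maximal CHSH value of a two-qubit
state to prove that it violates the CHSH inequality if $\cos^2\alpha > \frac{1}{\sqrt{2}}$. □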
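
The last step of the proof can be checked numerically. The sketch below rests on two assumptions that are not spelled out in the surviving text: that the state reads $\cos\alpha\,|\Psi^+\rangle_{AB}|2\rangle_C + \sin\alpha\,|GHZ\rangle_{ABC}$ with $|\Psi^+\rangle = (|01\rangle + |10\rangle)/\sqrt{2}$ and $|GHZ\rangle = (|000\rangle + |111\rangle)/\sqrt{2}$, as the construction in the proof implies, and that the maximal CHSH value of a two-qubit state is given by the Horodecki criterion, $2\sqrt{t_1 + t_2}$ with $t_1, t_2$ the two largest eigenvalues of $T^{T}T$ for the correlation matrix $T_{ij} = \mathrm{Tr}(\rho\,\sigma_i \otimes \sigma_j)$. All function names are hypothetical.

```python
import math

PAULIS = {
    "x": ((0, 1), (1, 0)),
    "y": ((0, -1j), (1j, 0)),
    "z": ((1, 0), (0, -1)),
}


def state_abc(alpha):
    """Amplitudes keyed by (a, b, c) of the assumed three-party pure state."""
    c = math.cos(alpha) / math.sqrt(2)
    s = math.sin(alpha) / math.sqrt(2)
    return {(0, 1, 2): c, (1, 0, 2): c, (0, 0, 0): s, (1, 1, 1): s}


def reduced_ab(psi):
    """Density matrix of A and B (partial trace over C), as a sparse dict."""
    rho = {}
    for (a, b, c), amp in psi.items():
        for (a2, b2, c2), amp2 in psi.items():
            if c == c2:
                key = ((a, b), (a2, b2))
                rho[key] = rho.get(key, 0.0) + amp * amp2.conjugate()
    return rho


def correlation_matrix(rho):
    """T_ij = Tr(rho sigma_i x sigma_j), returned as a dict keyed by 'xx', 'xy', ..."""
    t = {}
    for i, si in PAULIS.items():
        for j, sj in PAULIS.items():
            value = sum(r * si[a2][a] * sj[b2][b]
                        for ((a, b), (a2, b2)), r in rho.items())
            t[i + j] = value.real
    return t


def max_chsh(alpha):
    """Maximal CHSH value of rho_AB via the Horodecki criterion (T is diagonal here)."""
    t = correlation_matrix(reduced_ab(state_abc(alpha)))
    assert all(abs(t[i + j]) < 1e-12 for i in "xyz" for j in "xyz" if i != j)
    squares = sorted((t[k + k] ** 2 for k in "xyz"), reverse=True)
    return 2.0 * math.sqrt(squares[0] + squares[1])


if __name__ == "__main__":
    for cos2 in (0.6, 1 / math.sqrt(2), 0.75, 0.9):
        alpha = math.acos(math.sqrt(cos2))
        print(f"cos^2(alpha)={cos2:.4f}  max CHSH={max_chsh(alpha):.4f}")
```

With these assumptions the correlation matrix is $\mathrm{diag}(\cos^2\alpha, \cos^2\alpha, 1 - 2\cos^2\alpha)$, and the printed values cross 2 at $\cos^2\alpha = 1/\sqrt{2}$, in agreement with the lemma.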

Theorem 5.4 proves that, in order to reproduce observed statistics with
communication, one has either to postulate a communication that propagates at infinite speed
(in which case the universe is instantaneously connected and everything is possible), or
to conclude that faster-than-light signaling is possible after all (in which case not only
quantum entanglement, but the whole of physics should be revisited). Hence this result
comes as close as possible to a full falsification of signaling models.

6 Towards a device-independent definition of quantum physics 

The device-independent outlook is fruitful to falsify alternative models and to promote 
Bell tests as certification tools. It is also encouraging to revisit old foundational questions, 
notably the one I present in this last section: what defines quantum physics? 

6.1 Traditional approaches to a definition of quantum physics 

The vast majority of presentations of quantum physics define it through its mathematical
structure, something that is often referred to as "assuming the Hilbert space". Explicitly,
the minimal assumption for the kinematics is that every physical property P is described
by a subspace $\mathcal{E}_P$ of a Hilbert space $\mathcal{H}$, with the rule that perfectly distinguishable
properties are associated to orthogonal subspaces. Gleason's theorem then leads to Born's
probability rule, from which in turn follow all those other rules that are presented as

^^Strictly speaking, the statistics used for the proof have not been observed yet. But they come from 
a set of few von Neumann measurements on a four-qubit state, and I don't see any reason to doubt the 
accuracy of quantum predictions in such a case. 

"^^I am referring here to the original Gleason's theorem, and am extending its conclusions to the case 
d = 2 which is not covered by the proof. There exist Gleason-like theorems covering the case d = 2, 
and with much simpler proofs: the price to pay is that the assumptions (typically, the whole algebra of 
POVMs) become even harder to justify. 
"axioms" in some introductory textbooks; the unitary representation of symmetries is
another theorem, Wigner's. For the dynamics, the independent assumption of reversibility
is required.

This approach is excellent for practical purposes, clarifying the mathematical structure
that is accepted by everyone as a working tool. One can also accept this mathematical
structure a posteriori, based on its predictive power. Nevertheless, it is legitimate
to ask if the Hilbert space assumption can be replaced by something more appealing a
priori. Even a simple review of the approaches that have been proposed would take us
too long. But, in one way or another, they all assume the possibility of characterizing the
state: in other words, they assume that one can identify a closed set of measurements, 
such that the state is defined completely by the statistics of those measurements. As 
a consequence, before applying the formalism to a concrete case, one needs to have a 
pretty good idea of the degree of freedom under study and of the measurement devices 
that are in principle available. 

All this is perfectly legitimate and common practice in physics. However, from the 
vantage point of device-independent assessment, we may hope to do even better. The
hierarchy of semi-definite criteria described in paragraph 4.3.2 goes only half way: it does
define the possible physical observations in terms of a device-independent criterion (the 
positivity of some matrices built only on observed statistics); but there is no justification 
for this criterion, other than the fact that it recovers the statistics achievable with the 
Hilbert space formalism. Waiting for someone to find a physical reason, independent 
of quantum physics, why those matrices should be positive, I review here the partial 
successes achieved in answering the question: can quantum physics be defined in terms 
of device-independent physical principles? 

6.2 No-signaling statistics 

6.2.1 No-signaling as a framework 

As we have argued above, in order to simulate quantum statistics, we would need to use 
communication; but quantum physics achieves the same result without communication. 
It is tempting therefore to postulate "no-signaling through observation" as a physical 
principle that must be respected in nature. This no-signaling principle comes quite close 
to defining quantum physics itself, by cutting out all the models based on communication 
(which are unconstrained as for the statistics they can distribute) . Popescu and Rohrlich 
[Popescu and Rohrlich 1994] went further and asked whether the no-signaling principle 
defines quantum physics tightly. They found a counter-example (see next paragraph), so 
we need to find a more refined principle, or maybe a set of such principles. But all the 
following discussion will have the set of no-signaling statistics as underlying framework. 

The set of no-signaling statistics is a polytope, since it is embedded in a finite-dimensional
space and is defined by the linear no-signaling constraints. Its extremal points
are the local deterministic points (the same as for the local polytope) and some
non-deterministic points. For the purpose of this text, we do not need to spend more time
on these mathematical characterizations: I shall just present the simplest example of an
extremal no-signaling point that is not achievable with quantum statistics. It is the very
example given by Popescu and Rohrlich and is nowadays generally called the PR-box.

6.2.2 The PR-box 

We focus on the CHSH scenario of two parties, two inputs and two outputs. In paragraph
2.4.2 we noticed that the correlation vector w = (+1, +1, +1, -1) would reach the
algebraic maximum S = 4 of CHSH; shortly after, we established the Tsirelson
bound $S = 2\sqrt{2}$. This means that the correlations w, which can be compactly written as

$a \oplus b = xy$ for $a, b, x, y \in \{0, 1\}$ , (91)

cannot be distributed with quantum physics.

There are four deterministic points that achieve w: 
But each of these deterministic points violates the no-signaling constraint. In one of them,
for instance, Bob can directly read Alice's input by choosing y = 1, since $b_{y=1} = x$. It is
easy to check that only one convex combination of these points satisfies the no-signaling
condition, namely the PR-box

$\mathcal{P}_{PR}$ : $P(a,b|0,0) = P(a,b|0,1) = P(a,b|1,0) = \frac{1}{2}\delta_{a=0,b=0} + \frac{1}{2}\delta_{a=1,b=1}$ , (92)
$P(a,b|1,1) = \frac{1}{2}\delta_{a=0,b=1} + \frac{1}{2}\delta_{a=1,b=0}$ . (93)

This uniqueness, together with the fact that S = 4 is the maximum CHSH can reach,
immediately implies that $\mathcal{P}_{PR}$ is an extremal point of the no-signaling polytope.
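
The two defining properties of the PR-box, no-signaling and S = 4, can be checked by direct enumeration. The following is a minimal sketch, not part of the original derivation: the function names are mine, and `signaling_point` is one illustrative deterministic point (a = 0, b = xy) chosen because it reproduces the remark that Bob reads x when y = 1; it is not necessarily one of the four points of the original list.

```python
from fractions import Fraction
from itertools import product


def pr_box(a, b, x, y):
    """P(a,b|x,y) of Eqs. (92)-(93): uniform over outcomes with a XOR b == x*y."""
    return Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)


def signaling_point(a, b, x, y):
    """Illustrative deterministic point reaching S = 4: a = 0, b = x*y."""
    return Fraction(1) if (a == 0 and b == (x & y)) else Fraction(0)


def is_no_signaling(p):
    """True if each party's marginal is independent of the other party's input."""
    for a, x in product((0, 1), repeat=2):
        marginals = {sum(p(a, b, x, y) for b in (0, 1)) for y in (0, 1)}
        if len(marginals) != 1:
            return False
    for b, y in product((0, 1), repeat=2):
        marginals = {sum(p(a, b, x, y) for a in (0, 1)) for x in (0, 1)}
        if len(marginals) != 1:
            return False
    return True


def chsh(p):
    """S = E00 + E01 + E10 - E11, with E_xy = P(a=b|x,y) - P(a!=b|x,y)."""
    def corr(x, y):
        return sum((-1) ** (a ^ b) * p(a, b, x, y)
                   for a, b in product((0, 1), repeat=2))
    return corr(0, 0) + corr(0, 1) + corr(1, 0) - corr(1, 1)
```

Both `pr_box` and `signaling_point` reach the algebraic maximum of CHSH, but only the former passes `is_no_signaling`.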

In the last decade or so, intriguing results have been obtained by considering the
PR-box and its generalizations as resources for distributing correlations: the interested
reader will find basic information and references in [Brunner et al. 2013] [Scarani 2006].
In the following, I shall use the PR-box simply as the prototypical example of no-signaling 
statistics that cannot be achieved with quantum physics. 



*®When it comes to matters of priority, it had been mentioned in the literature as early as 1985 by 
other authors, see [Brunner et al. 2013] for the details.
6.3 Device-independent physical principles

6.3.1 Information causality 

Information causality (IC) may be the physical principle that defines quantum physics,
but we have not been able to prove it yet; for sure, it is the attempt that comes closest
to selecting the quantum set within the no-signaling polytope. Here, I limit myself to
presenting the initial intuition, because Marcin Pawlowski (who first had the
idea) and I have recently written a synthetic text [Pawlowski and Scarani 2013] which
is, in my opinion, as clear as it gets.





Fig. 4. The task that inspires information causality and how the PR-box achieves it. 



It all starts by finding something that would "go wrong" if PR-boxes existed.
The starting point to formulate IC is the power of the PR-box in the communication
task described in Fig. 4. Alice's input consists of a pair of bits $(x_0, x_1) \in \{0, 1\}^2$ drawn
uniformly at random among the four possible values; Bob's input is a bit $y \in \{0, 1\}$
unknown to Alice. The goal of the channel is to give Bob $x_y$ without giving him any
knowledge of $x_{1-y}$.

Let us pause to examine this situation classically. If Alice cannot send any information
to Bob, obviously Bob cannot retrieve either bit. If Alice can send two bits, Bob can
trivially retrieve both $x_y$ and $x_{1-y}$. The interesting case is when Alice is restricted to
sending only one bit: may they succeed in the task, possibly with the help of pre-established
LV? One can prove that they cannot: the best Alice can do with one bit of communication
is the obvious strategy: she encodes by default $x_0$, sends it, and Bob outputs it. If Bob
received y = 0, his output is correct; if he received y = 1, his output is uncorrelated
with the right answer $x_1$.
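
The classical limitation can be checked by brute force. The sketch below enumerates every deterministic one-bit strategy (Alice's encoder and Bob's decoder) and returns the best average probability that Bob's output equals $x_y$, for uniformly random inputs; the function name and the choice of the average success probability as figure of merit are mine. Pre-established LV only mix such deterministic strategies, so they cannot raise this average.

```python
from itertools import product


def best_classical_success():
    """Best average P(guess == x_y) over all deterministic one-bit strategies."""
    pairs = list(product((0, 1), repeat=2))   # all (x0, x1), also reused as (m, y)
    best = 0.0
    # Alice's encoder: one message bit for each input pair (16 functions).
    for enc in product((0, 1), repeat=4):
        encode = dict(zip(pairs, enc))
        # Bob's decoder: one guess for each (message, y) pair (16 functions).
        for dec in product((0, 1), repeat=4):
            decode = dict(zip(pairs, dec))
            wins = sum(decode[(encode[(x0, x1)], y)] == (x0, x1)[y]
                       for (x0, x1) in pairs for y in (0, 1))
            best = max(best, wins / 8)
    return best
```

The maximum found is 3/4, which is exactly what the obvious strategy achieves: always right for y = 0, right half of the time for y = 1.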

Remarkably though, Alice and Bob would succeed if, instead of pre-sharing LV, they
were allowed to pre-share a PR-box. Indeed, here goes the protocol: Alice inputs
$x = x_0 \oplus x_1$ in the PR-box; she gets the outcome a and sends to Bob the single-bit

^'^I am still able to use the copy-and-paste function of my computer, but I don't see any point in using 
it here. 

*®This task is known in information science as the simplest example of "random access code"; or, if
one adds the requirement that Alice is forbidden to know Bob's choice even a posteriori, as "oblivious
transfer".
message $m = a \oplus x_0$. Bob inputs y in the PR-box, gets the outcome b and produces the
guess $\beta = m \oplus b = (a \oplus b) \oplus x_0$. By the rule (91) of the PR-box, $a \oplus b = (x_0 \oplus x_1)y$; so
$\beta = x_0 \oplus [(x_0 \oplus x_1)y] = x_y$ as claimed.
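
The protocol is short enough to be verified exhaustively. In this sketch (function names are mine) the PR-box is modelled by letting Alice's outcome a be a free bit and setting $b = a \oplus xy$; looping over both values of a covers all the randomness of the box.

```python
from itertools import product


def pr_box_outputs(x, y, a):
    """Outcomes (a, b) of a PR-box for inputs x, y, given Alice's random bit a."""
    return a, a ^ (x & y)          # rule (91): a XOR b = x*y


def bob_guess(x0, x1, y, a):
    """Run the protocol once and return Bob's guess beta."""
    x = x0 ^ x1                    # Alice's input to the box
    a, b = pr_box_outputs(x, y, a)
    m = a ^ x0                     # the single bit Alice sends
    return m ^ b                   # beta = m XOR b


def protocol_always_succeeds():
    """True if beta == x_y for every input and every outcome of the box."""
    return all(bob_guess(x0, x1, y, a) == (x0, x1)[y]
               for x0, x1, y, a in product((0, 1), repeat=4))
```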

Nothing has gone blatantly wrong: Alice sends one bit, Bob gets one bit, because the
PR-box is a no-signaling resource. It is nevertheless puzzling that Bob can guess perfectly
either of the two bits: it looks as if both bits had been transferred to his location, even if
he is allowed to read only one. Information causality elevates to a principle that this should
not happen: positively, if Alice sends one bit, the amount of useful information at Bob's
location cannot exceed one bit [Pawlowski et al. 2009]. When formulated mathematically,
it turns out that IC is respected if Alice and Bob share entanglement. For the
CHSH scenario, IC is violated by any no-signaling resource such that $S > 2\sqrt{2}$: in other
words, the Tsirelson bound is recovered without any reference to the algebra of Hilbert
spaces.



6.3.2 Macroscopic locality 

Like information causality, macroscopic locality [Navascues and Wunderlich 2009] is defined
in terms of a restricted task. The scenario is that of a normal Bell experiment, with
a source producing i.i.d. signals that would generate the "microscopic" statistics $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$.
However, the boxes are not capable of measuring individual events and reconstructing those
statistics: for each choice of their measurement settings, they are restricted to observing
only "macroscopic" averages.

Specifically, for any x, Alice can access only the currents $\vec{I}_x = (I_{a=0|x}, I_{a=1|x}, \ldots)$
created in her $m_A$ detectors by sending N signals. Similarly, Bob can access
$\vec{J}_y = (J_{b=0|y}, J_{b=1|y}, \ldots)$ defined in an identical way. This defines a scenario with the same
number of settings as the microscopic one but a much larger alphabet of outcomes.
After repeating such an experiment many times, Alice and Bob can reconstruct the
statistics of these macroscopic currents

$\mathcal{P}^{N}_{\mathcal{X},\mathcal{Y}} = \{P(\vec{I}_x, \vec{J}_y),\ x \in \mathcal{X},\ y \in \mathcal{Y}\}$ . (94)

The principle of macroscopic locality states that $\mathcal{P}^{N}_{\mathcal{X},\mathcal{Y}}$ should not violate any Bell
inequality in the limit $N \to \infty$. The rationale is that coarse-graining should lead to classical
physics, an intuition to which many would unhesitatingly subscribe.

It is easy to show that the PR box violates macroscopic locality. The macroscopic
statistics are given by

$\mathcal{P}^{N}_{\mathcal{X},\mathcal{Y}}[\text{PR box}] : \begin{cases}
(x,y) = (0,0) : & \vec{I}_0 = \vec{J}_0 \\
(x,y) = (0,1) : & \vec{I}_0 = \vec{J}_1 \\
(x,y) = (1,0) : & \vec{I}_1 = \vec{J}_0 \\
(x,y) = (1,1) : & \vec{I}_1 = (I, N - I),\ \vec{J}_1 = (N - I, I)
\end{cases}$ (95)

*^For fixed finite $N$, $\vec{I}_x$ takes values in $\mathbb{N}^{m_A}$ restricted by $\sum_a I_{a|x} = N$; and similarly for $\vec{J}_y$. But
since we are going to consider the limit $N \to \infty$, the discrete and finite structure won't play any role.

^'^A point of logic: in order to accept the principle, it is enough to consider coarse-graining at the 
measurement device as a possible path for the emergence of the classical world. It does not need to be 
believed as the only mechanism. 
In order to show that these statistics violate some Bell inequality, we can append the
following local post-processing (majority vote): for each run of the macroscopic experiment,
Alice's box outputs $\alpha = 0$ if $I_{a=0|x} \geq I_{a=1|x}$, and $\alpha = 1$ otherwise; Bob's box
outputs $\beta$ with the same rule. Then, the statistics $P(\alpha, \beta|x, y)$ thus obtained define again
the PR box. Loosely speaking, this proves that the PR box gets out unscathed from the
coarse-graining process and therefore grossly violates macroscopic locality.
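
The majority-vote argument can be simulated directly. The sketch below (names and parameter values are mine) generates the currents of Eq. (95) from N i.i.d. uses of a PR-box and checks that the post-processed outputs satisfy $\alpha \oplus \beta = xy$ in every run. N is taken odd so that ties in the majority vote cannot occur; for even N a tie has small but non-zero probability, which is one reason to read the statement "loosely".

```python
import random


def macroscopic_run(x, y, n_signals, rng):
    """Currents (I_0, I_1) and (J_0, J_1) from n_signals i.i.d. PR-box uses."""
    I = [0, 0]
    J = [0, 0]
    for _ in range(n_signals):
        a = rng.randrange(2)       # Alice's outcome is uniformly random
        b = a ^ (x & y)            # PR-box rule: a XOR b = x*y
        I[a] += 1
        J[b] += 1
    return I, J


def majority(current):
    """Output 0 if detector 0 clicked at least as often as detector 1, else 1."""
    return 0 if current[0] >= current[1] else 1


def coarse_grained_is_pr(n_signals=101, runs=200, seed=0):
    """True if alpha XOR beta == x*y in every simulated macroscopic run."""
    rng = random.Random(seed)
    for x in (0, 1):
        for y in (0, 1):
            for _ in range(runs):
                I, J = macroscopic_run(x, y, n_signals, rng)
                if majority(I) ^ majority(J) != (x & y):
                    return False
    return True
```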

The extent to which the principle of macroscopic locality approximates the quantum 
set Q is known exactly: 

Theorem 6.1. $\mathcal{P}^{N}_{\mathcal{X},\mathcal{Y}}$ can be reproduced by LV if and only if $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ belongs to $Q_1$, the set of
statistics that pass the first test of the hierarchy of semi-definite criteria. In particular,
since $Q \subset Q_1$, all quantum statistics respect macroscopic locality, but there are
non-quantum statistics that respect it as well.

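Proof. As shown in paragraph 2.2.2, $\mathcal{P}^{N}_{\mathcal{X},\mathcal{Y}}$ can be reproduced by LV if and only if
each of the $P(\vec{I}_x, \vec{J}_y)$ can be computed as marginal of a joint probability distribution
$P$ of all the currents $\{\vec{I}_x\}_{x \in \mathcal{X}}$ and $\{\vec{J}_y\}_{y \in \mathcal{Y}}$.
Now, notice that

$I_{a|x} = \sum_{n=1}^{N} \delta_{a_x(n)=a}$ , $J_{b|y} = \sum_{n=1}^{N} \delta_{b_y(n)=b}$ (96)

are sums of i.i.d. random variables. Each $\vec{I}_x$ being a vector of $m_A$ numbers $I_{a|x}$, and each
$\vec{J}_y$ being a vector of $m_B$ numbers $J_{b|y}$, they are in turn sums of i.i.d. random vectors.
If P exists, the central limit theorem states that the fluctuations of its variables around
their average must obey, in the limit $N \to \infty$, a multivariate Gaussian distribution with
zero average and covariance matrix $\Gamma \geq 0$. Conversely, if such a distribution exists, it
defines a valid P.

The proof will be finished by showing that $\Gamma$ is essentially the matrix M that defines
$Q_1$ for $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$, as defined in paragraph 4.3.2. Indeed, let's define the fluctuations as

$f_{a|x} = \frac{I_{a|x} - \langle I_{a|x}\rangle}{\sqrt{N}}$ , $f_{b|y} = \frac{J_{b|y} - \langle J_{b|y}\rangle}{\sqrt{N}}$ ; (97)

the entries of the covariance matrix are the $\Gamma_{ij} = \langle f_i \cdot f_j \rangle$, i.e. with a suitable labeling

$\Gamma = \begin{pmatrix} \{\langle \vec{f}_x \cdot \vec{f}_{x'} \rangle\} & \{\langle \vec{f}_x \cdot \vec{f}_{y} \rangle\} \\ \{\langle \vec{f}_y \cdot \vec{f}_{x} \rangle\} & \{\langle \vec{f}_y \cdot \vec{f}_{y'} \rangle\} \end{pmatrix}$ . (98)

The elements of the off-diagonal blocks are terms of the form

$\langle f_{a|x} f_{b|y} \rangle = \frac{1}{N}\left(\langle I_{a|x} J_{b|y}\rangle - \langle I_{a|x}\rangle\langle J_{b|y}\rangle\right)
= P(a,b|x,y) - P(a|x)P(b|y)$ . (99)

The terms of the diagonal blocks have a similar form, but can be associated to observable
probabilities only when x = x' (y = y'). This matrix defines indeed the step $Q_1$ of the
hierarchy, up to redefining the measurement operators as $F_{a|x} = \Pi_a^x - \langle \Pi_a^x \rangle \mathbb{1}_A$, and
similarly for Bob's operators.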
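
The second equality in (99) relies only on the i.i.d. assumption. A short worked version, using the sums of Eq. (96) (the summation indices n, m are introduced here for the purpose of the derivation):

```latex
\begin{align*}
\langle I_{a|x} J_{b|y} \rangle
  &= \sum_{n,m=1}^{N} \langle \delta_{a_x(n)=a}\, \delta_{b_y(m)=b} \rangle
   = N\, P(a,b|x,y) + N(N-1)\, P(a|x) P(b|y), \\
\langle I_{a|x} \rangle \langle J_{b|y} \rangle
  &= N^2\, P(a|x) P(b|y), \\
\tfrac{1}{N}\left( \langle I_{a|x} J_{b|y} \rangle
  - \langle I_{a|x} \rangle \langle J_{b|y} \rangle \right)
  &= P(a,b|x,y) - P(a|x) P(b|y) .
\end{align*}
% The N terms with n = m refer to the same signal and give the joint probability;
% the N(N-1) terms with n != m refer to independent signals and factorize.
```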
It is important to stress the crucial role of the i.i.d. assumption, which is required in
the proof above in order to invoke the central limit theorem:

• If i.i.d. is not requested, quantum states that violate a Bell inequality for the
coarse-grained detection can be found (although, for large N, they may be impossible to
realize in practice). So, the theorem does not say that coarse-graining alone washes
all Bell violation away.

• Under the promise that their source is producing i.i.d. signals, Alice and Bob
can infer $\mathcal{P}_{\mathcal{X},\mathcal{Y}}$ from the covariance matrix of their observed macroscopic statistics.
Thus, they may infer that their source could be used to violate some Bell inequality,
if fine-grained detection were available.
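
The second point can be illustrated numerically: from macroscopic currents alone, the covariance of the fluctuations recovers $P(a,b|x,y) - P(a|x)P(b|y)$ as in Eq. (99). The sketch below uses a toy microscopic distribution chosen only for illustration (a uniform, b equal to a with probability 0.8) and a fixed pair of settings; all names and parameter values are assumptions of the example.

```python
import random


def sample_pair(rng, p_same=0.8):
    """One microscopic event of the toy model: a uniform, b = a with prob. p_same."""
    a = rng.randrange(2)
    b = a if rng.random() < p_same else 1 - a
    return a, b


def estimate_covariance(n_signals=200, runs=2000, seed=1):
    """Estimate <f_{a=0} f_{b=0}> using only the macroscopic currents I_0, J_0."""
    rng = random.Random(seed)
    currents = []
    for _ in range(runs):
        i0 = j0 = 0
        for _ in range(n_signals):
            a, b = sample_pair(rng)
            i0 += (a == 0)
            j0 += (b == 0)
        currents.append((i0, j0))
    mean_i = sum(i for i, _ in currents) / runs
    mean_j = sum(j for _, j in currents) / runs
    cov = sum((i - mean_i) * (j - mean_j) for i, j in currents) / runs
    return cov / n_signals         # each fluctuation carries a factor 1/sqrt(N)


# Microscopic value for the toy model: P(0,0) - P(a=0) P(b=0) = 0.4 - 0.25
expected = 0.5 * 0.8 - 0.5 * 0.5
```

Up to statistical fluctuations, `estimate_covariance()` agrees with `expected`; repeating the estimate for every pair of outcomes and settings would fill the off-diagonal blocks of the covariance matrix.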

6.3.3 A temporary balance

Let me finish by summarizing where we stand in the quest for a complete device- 
independent definition of quantum physics. 

One path towards it may pass through finding physical interpretations for the steps
$Q_n$ of the NPA hierarchy (so far, we know only the physical meaning of $Q_1$). However,
this hierarchy is merely the only family of tests for which convergence has been proved:
nothing guarantees that all its steps must have a clear physical meaning. In particular,
information causality does not correspond to any step of that hierarchy and, as far as
we know, a suitable generalization of it may already define the quantum set exactly.

7 Conclusion 

Set in a science museum, in a not too distant future: 

- "Mum, what are those two boxes?"

- "They show the violation of Bell inequalities. You remember when you went with dad 
to buy pieces for the new quantum computer and he quarreled with the vendor? The 
fellow was trying to sell boxes of low quality, but your dad knows these things: nobody 
can cheat him." 

- "The small shiny boxes?"

- "Yes, nowadays they are very small. This one here is the old kind, the one we had
when I was a student. I played a bit with such boxes. They changed the way I look at 
nature." 

- "Wow! How does it work?" 

- "Nobody really knows." 
A Reading EPR again

In most texts devoted to Bell inequalities, including this one, the original EPR paper 
[Einstein, Podolsky and Rosen 1935] is quoted only at the beginning and never mentioned
again. In this appendix, I revisit the physical process proposed by EPR in the light of 
our present understanding. 

The EPR state is a bipartite state of two particles on a line, each characterized by
the Hilbert space $L^2(\mathbb{R})$. It is immediate to check that the operators $x_1 - x_2$ and $p_1 + p_2$
commute; so one can define the state that satisfies both

$x_1 - x_2 = d$ and $p_1 + p_2 = u$ (100)

with $d, u \in \mathbb{R}$. Explicitly, this state is such that:

• If position is measured, particle 1 can be found anywhere and particle 2 will be
found at $x_2 = x_1 - d$. If d = 0, the particles are found at the same location, but
this value does not play any role in the argument.

• If momentum is measured, particle 1 can be found with any value and particle 2
will be found to have $p_2 = u - p_1$. If u = 0, the particles have opposite momentum,
but again this does not play any role.
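
The commutation of the two operators that define the EPR state follows from the canonical commutation relations $[x_j, p_k] = i\hbar\,\delta_{jk}$, spelled out here for completeness:

```latex
\begin{align*}
[x_1 - x_2,\ p_1 + p_2]
  &= [x_1, p_1] + [x_1, p_2] - [x_2, p_1] - [x_2, p_2] \\
  &= i\hbar + 0 - 0 - i\hbar = 0 .
\end{align*}
```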

The EPR reasoning is the following: if I measure the position of particle 1 and find
$x_1 = x$, I know for sure what the result of a measurement of position of particle 2 would
be, namely $x_2 = x - d$. So I can just as well learn something else by measuring momentum
of particle 2: upon finding $p_2 = p$, I know that a measurement of momentum on particle
1 would have given $p_1 = u - p$. Notice that this does not contradict the uncertainty
relations.

Our understanding of LV statistics is a powerful tool to analyze this reasoning: in fact, 
in a sense, we have already analyzed it. Indeed, we can rephrase it with the singlet state 

^^One can as well study the case where $x_1 + x_2$ and $p_1 - p_2$ are used to define the state.

^^The predictions for the EPR state, like any state in quantum theory, obey the uncertainty relations: 
for instance, the distribution of $p_1 = u - p_2$ is uniformly spread on $\mathbb{R}$, even post-selecting on the runs in
which one finds a given value $x_1 = x$. Indeed, $\Delta x \Delta p \geq \hbar/2$ means that one cannot prepare a source, such
that the statistics of both position and momentum are sharply defined. This does not imply logically 
that both position and momentum cannot be sharply defined in each single run. Admittedly, if position 
and momentum were well-defined in each run, it would be hard to understand why nature conspires to 
hide this from us over many runs; nevertheless, the existence of LV models for measurements on single 
degrees of freedom cannot be denied, however "unnatural" one may find them. 
of two qubits: if I measure $\sigma_z$ on Alice's qubit and find +1, I know that a measurement
of $\sigma_z$ on Bob's qubit would have yielded $-1$. So I can just as well measure $\sigma_x$ on Bob's
qubit, etc. We have seen in paragraph [231 that the statistics of these measurements can 
be reproduced with LV; and so can the statistics of a measurement on a single particle. 
Therefore, my assessment is that EPR drew the right conclusion, based on their limited 
evidence: the existence of pre-determined values for single particles cannot be excluded, 
and the EPR reasoning even enforces this as the most reasonable explanation of some 
other predictions of quantum theory! Only the violation of Bell inequalities can shatter 
this vision. 

As it turns out, EPR were not on the simplest track to formulate the problem in the 
right way: 

• The mental image of the experiment is complex: particle 1(2) is not "the particle 
that will be measured in the localized region called Alice's (Bob's) lab" . There must 
be another label to identify the subsystems, for instance the type (particle 1 is a 
proton, particle 2 an electron), or the state of an internal degree of freedom (particle 
1 is spin up, particle 2 is spin down). Here we can appreciate the contribution of 
Bohm, who made it possible to envisage observations in space-like separated labs 
by describing entanglement in internal degrees of freedom rather than in position 
and momentum. 

• Even if EPR had been aware of the inequalities and had tried to violate them, they 
were handling one of the most difficult states for the task. Indeed, the EPR state 
has a positive Wigner function: so, as long as both particles undergo arbitrarily 
many measurements of the type $\cos\theta\, x + \sin\theta\, p$, there exists an LV model that describes
all the statistics. In order to violate Bell inequalities, the theorist may rewrite the 
EPR state in the basis of eigenstates of the harmonic oscillator, thus going back to 
the discrete-variable formalism in which we proved Gisin's theorem. 

Ultimately, I think that we must admit the evidence: EPR had a great intuition and
pinpointed a crucial feature of quantum physics, but it took the work of Bell to put this 
intuition in its suitable conceptual framework. 

B The tortuous path to device-independence 

I have presented the notion of device-independent assessment as a natural consequence 
of the meaning of Bell inequalities — and such, I am convinced, it is. However, as it 
often happens, we human beings need some time to straighten our thoughts. Here I am 
going to describe how we reached there. 

B.l Prehistory 

No scientific idea is born out of nothing: there are hints, precursors, that someone may 
want to call "lost opportunities" but really just prove that ground must be broken before 
a seed can become a tree. 
The fact that $S = 2\sqrt{2}$ identifies the singlet up to local isometries was proved as early
as 1992 [Popescu and Rohrlich 1992b] [Braunstein, Mann and Revzen 1992], as the answer
to the question "which quantum states violate CHSH maximally". Nobody thought
of turning it around and using CHSH to certify the presence of those states. Curiously,
when someone had that idea, they did not use Bell inequalities: Mayers and Yao proposed
their own scheme in 1998 and invented the term "self-testing" for the occasion
[Mayers and Yao 1998, 2004]. The Mayers-Yao papers contained all it takes to ignite the
revolution, including the motivation by quantum cryptography; but failed to. I can only
try and guess why: quantum cryptography was still in its very early days. Not many
could follow Mayers' very complicated proof of "unconditional security" that assumed
qubits and well-defined commutation relations. It is safe to guess that even fewer people
would embark on studying an equivalently complicated proof of self-testing apparatuses!

The Vienna-Gdansk collaboration were among those who, like me, were still clinging 
to the hope that Bell was not a piece for a museum. In 2004 they published a paper re- 
lating Bell inequalities to communication complexity games [Brukner et al. 2004] . With 
hindsight, the approach is a bit artificial: the games were defined in terms of Bell inequalities,
nobody would have invented them otherwise. Still, I can't refrain from quoting the
last sentence of the abstract: "Thus, violation of Bell's inequalities has a significance 
beyond that of a non-optimal-witness of non-separability" . 

As for myself, the first contact with device-independence dates from a visit that
Nicolas Gisin and I were paying to Sandu Popescu in Bristol, probably in 2002. Nicolas, 
who was starting to go commercial with quantum cryptography, was concerned about 
certification. We were having a moderately calm discussion about this topic, Sandu being 
notoriously not excited by those practical developments and participating more out of 
friendship. The conclusion was simple: the ultimate certification must come from Bell 
inequalities. There and then, this conclusion, obvious for anyone who has understood 
Bell, did not seem particularly deep to us. We had simply in mind a two-step procedure, 
in which one would first test the quantumness of the device using Bell, then run a usual 
protocol: not an idea to write a single paper about. Why did we not think further? 
Nicolas and I just don't know, Sandu has probably forgotten the incident. 

B.2 Making history 

The turning point seems to be the year 2004. Quantum cryptography had grown to 
a quite mature field, with several experimental groups improving their technologies and 
theorists harnessing the techniques of security proofs. Also, some more abstractly-minded 
theorists had started playing with PR-boxes and their generalizations. In this context, 
Barrett, Hardy and Kent had a remarkable insight: maybe one can prove security based 
only on no-signaling. This would mean that "quantum" cryptography would actually re- 
main secure in all post-quantum theories that do not allow signaling by mere observation. 
They found the first example of such a protocol [Barrett, Hardy and Kent 2005].

Nicolas Gisin, who was my boss at that time, took their result very seriously and 
started working on it with Toni Acín and Lluis Masanes from Barcelona. Among other
things, they noticed something that the whole quantum information community had got 
wrong for years. By common knowledge, the BB84 and the Ekert protocols for quantum 
cryptography were considered equivalent, one being just an entanglement-based version
of the other. But this is true only under the assumption that one is using two-qubit
states and mutually unbiased measurements. In reality, the Ekert protocol can provide
device-independent security, because it is based on CHSH; while the security of BB84
and its entanglement-based version BBM is fully compromised in a device-independent
scenario, because (as we mentioned in the text) its perfect statistics can be reproduced
with LV with a sufficiently large alphabet.

I had not been part of the discussions because of a momentary estrangement from
Nicolas: he had mentioned this research to me, but manifestly I had other things in
mind. A few days after I came back in focus, Nicolas shared with me the first draft of
what I realized would become a milestone paper [Acín, Gisin and Masanes 2006]. I was
enraged with myself, it was too late to do anything to add my name there: fortunately,
I managed to give a positive turn to my rage and, in a few weeks, had produced most of
the generalizations that appear in the full-length paper [Scarani et al. 2006].

At the same time, Toni shared with me what he would consider as the next goal: 
to prove device-independent security against a quantum adversary. Indeed, security 
against no-signaling adversaries is conceptually appealing, but does not seem to be an 
urgent concern; moreover, it gives quite bad bounds. On the contrary, security against 
a "normal" quantum adversary would be highly relevant for blind certification: nothing 
else than the idea Nicolas and I had discussed in Bristol, but reloaded in a fully new 
context. This time, I did not miss the chance, joined the effort from the start and ended 
up producing the core of the proof together with Serge Massar [Acín et al. 2007]. This is
the first paper in which the wording "device-independent" is actually used: if my memory 
does not betray me, the term must be attributed to Toni. 

B.3 Developments 

Our collaboration has had the good taste of leaving some assumptions in the security 
proofs: quite a number of prominent researchers worked for some years to close that gap. 
In the mean-time, I had had the idea of source certification and was introduced to the 
works of Mayers and Yao; eventually, Matthew McKague would crack the self-testing 
code by developing a simple version of the proofs, as we have seen. The idea of device- 
independent randomness certification, a priori much simpler and more straightforward 
than cryptography, was brought forward only later: the merit must be shared between 
Roger Colbeck [Colbeck and Kent 2011] and my previous co-authors Toni, Serge and
Stefano Pironio [Pironio et al. 2010].

The device-independent program has also boosted the experimental quest for closing 
the detection loophole, something that was previously considered anecdotal at best (indeed,
nobody had ever taken seriously a conspiracy of the detectors, but everyone should
doubt the behavior of a black box). There are hints in the literature and in the corridors 
of conferences that the loophole-free Bell test is upcoming. 

All these developments can be gathered from [Brunner et al. 2013] or by browsing
the arXiv. 



^^It took some time for this obvious statement to be digested by those who had built their careers on 
proving that BB84 is "unconditionally secure" . 